[1/6] MM: Support __GFP_NOFAIL in alloc_pages_bulk_*() and improve doco

Message ID 163184741776.29351.3565418361661850328.stgit@noble.brown (mailing list archive)
State New, archived
Series congestion_wait() and GFP_NOFAIL

Commit Message

NeilBrown Sept. 17, 2021, 2:56 a.m. UTC
When alloc_pages_bulk_array() is called on an array that is partially
allocated, the level of effort to get a single page is less than when
the array was completely unallocated.  This behaviour is inconsistent,
but now fixed.  One effect of this is that __GFP_NOFAIL will not ensure
at least one page is allocated.

Also clarify the expected success rate.  __alloc_pages_bulk() will
allocate one page according to @gfp, and may allocate more if that can
be done cheaply.  It is assumed that the caller values cheap allocation
where possible and may decide to use what it has got, or to call again
to get more.

Acked-by: Mel Gorman <mgorman@suse.com>
Fixes: 0f87d9d30f21 ("mm/page_alloc: add an array-based interface to the bulk page allocator")
Signed-off-by: NeilBrown <neilb@suse.de>
---
 mm/page_alloc.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
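
To make the behaviour change easier to follow, here is a small userspace
model of the bulk loop.  It is only a sketch and not the kernel code:
struct model, fast_alloc() and bulk_model() are hypothetical names, the
array is assumed to be populated contiguously from index 0, and the
single-page fallback is assumed to always succeed.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the allocator state. */
struct model {
	int cheap_left;	/* pages the cheap fast path can still supply */
	int slow_calls;	/* how often the full-effort fallback ran */
};

/* Models the cheap per-cpu list allocation; fails once it runs dry. */
static bool fast_alloc(struct model *m)
{
	if (m->cheap_left <= 0)
		return false;
	m->cheap_left--;
	return true;
}

/*
 * Models the bulk loop.  use_nr_account selects the check from this
 * patch (!nr_account); otherwise the old check (!nr_populated) is used.
 * Returns the number of populated slots, as the real function does.
 */
int bulk_model(struct model *m, int nr_pages, int already_populated,
	       bool use_nr_account)
{
	int nr_populated = already_populated;	/* includes pre-existing pages */
	int nr_account = 0;			/* pages allocated by this call */

	while (nr_populated < nr_pages) {
		if (!fast_alloc(m)) {
			bool nothing_yet = use_nr_account ? !nr_account
							  : !nr_populated;
			if (nothing_yet) {
				/* Stands in for "goto failed_irq": one
				 * page allocated with full effort. */
				m->slow_calls++;
				nr_populated++;
			}
			break;
		}
		nr_account++;
		nr_populated++;
	}
	return nr_populated;
}
```

With two slots already filled and an empty fast path, the old check
returns without trying the fallback, while the new check makes one
full-effort attempt.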

Comments

Mel Gorman Sept. 17, 2021, 2:42 p.m. UTC | #1
I'm top-posting to cc Jesper with full context of the patch. I don't
have a problem with this patch other than the Fixes: being a bit
marginal, I should have acked as Mel Gorman <mgorman@suse.de> and the
@gfp in the comment should have been @gfp_mask.

However, an assumption the API design made was that it should fail fast
if memory is not quickly available but have at least one page in the
array. I don't think the network use case cares about the situation where
the array is already populated but I'd like Jesper to have the opportunity
to think about it.  It's possible he would prefer it's explicit and the
check becomes
(!nr_populated || ((gfp_mask & __GFP_NOFAIL) && !nr_account)) to
state that __GFP_NOFAIL users are willing to take a potential latency
penalty if the array is already partially populated but !__GFP_NOFAIL
users would prefer fail-fast behaviour. I'm on the fence because while
I wrote the implementation, it was based on other people's requirements.
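
For comparison, the three candidate checks can be written side by side
as plain predicates.  This is an illustrative sketch only:
MODEL_GFP_NOFAIL is a made-up bit standing in for __GFP_NOFAIL, and the
function names are hypothetical.

```c
#include <stdbool.h>

#define MODEL_GFP_NOFAIL 0x1u	/* hypothetical bit, not the kernel value */

/* Check before the patch: fall back only if the array looks empty. */
bool fallback_old(unsigned int gfp, int nr_populated, int nr_account)
{
	(void)gfp;
	(void)nr_account;
	return !nr_populated;
}

/* Check in the patch: fall back if this call allocated nothing. */
bool fallback_patch(unsigned int gfp, int nr_populated, int nr_account)
{
	(void)gfp;
	(void)nr_populated;
	return !nr_account;
}

/* Explicit variant: only NOFAIL callers pay for a partially filled array. */
bool fallback_explicit(unsigned int gfp, int nr_populated, int nr_account)
{
	return !nr_populated ||
	       ((gfp & MODEL_GFP_NOFAIL) && !nr_account);
}
```

The three only differ when the array is partially populated and this
call has allocated nothing so far (nr_populated > 0, nr_account == 0).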

On Fri, Sep 17, 2021 at 12:56:57PM +1000, NeilBrown wrote:
> When alloc_pages_bulk_array() is called on an array that is partially
> allocated, the level of effort to get a single page is less than when
> the array was completely unallocated.  This behaviour is inconsistent,
> but now fixed.  One effect of this is that __GFP_NOFAIL will not ensure
> at least one page is allocated.
> 
> Also clarify the expected success rate.  __alloc_pages_bulk() will
> allocate one page according to @gfp, and may allocate more if that can
> be done cheaply.  It is assumed that the caller values cheap allocation
> where possible and may decide to use what it has got, or to call again
> to get more.
> 
> Acked-by: Mel Gorman <mgorman@suse.com>
> Fixes: 0f87d9d30f21 ("mm/page_alloc: add an array-based interface to the bulk page allocator")
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
>  mm/page_alloc.c |    7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index b37435c274cf..aa51016e49c5 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5191,6 +5191,11 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,
>   * is the maximum number of pages that will be stored in the array.
>   *
>   * Returns the number of pages on the list or array.
> + *
> + * At least one page will be allocated if that is possible while
> + * remaining consistent with @gfp.  Extra pages up to the requested
> + * total will be allocated opportunistically when doing so is
> + * significantly cheaper than having the caller repeat the request.
>   */
>  unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
>  			nodemask_t *nodemask, int nr_pages,
> @@ -5292,7 +5297,7 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
>  								pcp, pcp_list);
>  		if (unlikely(!page)) {
>  			/* Try and get at least one page */
> -			if (!nr_populated)
> +			if (!nr_account)
>  				goto failed_irq;
>  			break;
>  		}
> 
>
NeilBrown Sept. 20, 2021, 11:48 p.m. UTC | #2
On Sat, 18 Sep 2021, Mel Gorman wrote:
> I'm top-posting to cc Jesper with full context of the patch. I don't
> have a problem with this patch other than the Fixes: being a bit
> marginal, I should have acked as Mel Gorman <mgorman@suse.de> and the
> @gfp in the comment should have been @gfp_mask.
> 
> However, an assumption the API design made was that it should fail fast
> if memory is not quickly available but have at least one page in the
> array. I don't think the network use case cares about the situation where
> the array is already populated but I'd like Jesper to have the opportunity
> to think about it.  It's possible he would prefer it's explicit and the
> check becomes
> (!nr_populated || ((gfp_mask & __GFP_NOFAIL) && !nr_account)) to
> state that __GFP_NOFAIL users are willing to take a potential latency
> penalty if the array is already partially populated but !__GFP_NOFAIL
> users would prefer fail-fast behaviour. I'm on the fence because while
> I wrote the implementation, it was based on other people's requirements.

I can see that it could be desirable to not try too hard when we already
have pages allocated, but maybe the best way to achieve that is for the
caller to clear __GFP_RECLAIM in that case.

Alternately, callers that really want the __GFP_RECLAIM and __GFP_NOFAIL
flags to be honoured could ensure that the array passed in is empty.
That wouldn't be difficult (for current callers).

In either case, the documentation should make it clear which flags are
honoured when.
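
As a rough sketch of the first option, a caller could pick its flags
based on whether it already holds pages.  MODEL_GFP_RECLAIM and
bulk_gfp_for() are hypothetical names with a made-up bit value, used
here only for illustration.

```c
#define MODEL_GFP_RECLAIM 0x2u	/* hypothetical bit, not the kernel value */

/*
 * Caller-side helper: when some pages are already in hand, drop the
 * reclaim bit so that a refill attempt stays cheap.  Other bits are
 * passed through unchanged; how __GFP_NOFAIL should be treated in that
 * case is left open here.
 */
unsigned int bulk_gfp_for(unsigned int gfp, int nr_already_populated)
{
	if (nr_already_populated > 0)
		gfp &= ~MODEL_GFP_RECLAIM;
	return gfp;
}
```

A caller taking the second option would instead leave the flags alone
and make sure the array it passes in is empty.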

Let's see what Jesper has to say.

Thanks,
NeilBrown


Vlastimil Babka Oct. 5, 2021, 9:16 a.m. UTC | #3
On 9/17/21 16:42, Mel Gorman wrote:
> I'm top-posting to cc Jesper with full context of the patch. I don't
> have a problem with this patch other than the Fixes: being a bit
> marginal, I should have acked as Mel Gorman <mgorman@suse.de> and the
> @gfp in the comment should have been @gfp_mask.
> 
> However, an assumption the API design made was that it should fail fast
> if memory is not quickly available but have at least one page in the
> array. I don't think the network use case cares about the situation where
> the array is already populated but I'd like Jesper to have the opportunity
> to think about it.  It's possible he would prefer it's explicit and the
> check becomes
> (!nr_populated || ((gfp_mask & __GFP_NOFAIL) && !nr_account)) to

Note that AFAICS nr_populated is an incomplete piece of information, as we
initially only count pages in the page_array as nr_populated up to the first
NULL pointer. So even before Neil's patch we could decide to allocate even
if there are pre-existing pages, but placed later in the array. Which could
be rather common if the array consumer starts from index 0? So with Neil's
patch this at least becomes consistent, while the check suggested by Mel
leaves in place the odd dependency on where pre-existing pages appear in the
page_array.
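
A small sketch of the counting described above, assuming it stops at
the first NULL pointer.  count_leading_populated() is a hypothetical
helper name and void pointers stand in for struct page pointers.

```c
#include <stddef.h>

/*
 * Counts populated entries only up to the first NULL pointer, so
 * pages stored later in the array are not seen.
 */
int count_leading_populated(void **page_array, int nr_pages)
{
	int nr_populated = 0;

	while (page_array && nr_populated < nr_pages &&
	       page_array[nr_populated])
		nr_populated++;
	return nr_populated;
}
```

An array holding two pages at the front counts as 2, while the same two
pages at the back count as 0, so a check based on nr_populated would
treat those two cases differently.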
Patch

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b37435c274cf..aa51016e49c5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5191,6 +5191,11 @@  static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,
  * is the maximum number of pages that will be stored in the array.
  *
  * Returns the number of pages on the list or array.
+ *
+ * At least one page will be allocated if that is possible while
+ * remaining consistent with @gfp.  Extra pages up to the requested
+ * total will be allocated opportunistically when doing so is
+ * significantly cheaper than having the caller repeat the request.
  */
 unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 			nodemask_t *nodemask, int nr_pages,
@@ -5292,7 +5297,7 @@  unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 								pcp, pcp_list);
 		if (unlikely(!page)) {
 			/* Try and get at least one page */
-			if (!nr_populated)
+			if (!nr_account)
 				goto failed_irq;
 			break;
 		}