[mm/cma] mm/cma: retry allocation of dedicated area on EBUSY

Message ID 20230419083851.2555096-1-sergii.piatakov@globallogic.com (mailing list archive)
State New

Commit Message

Sergii Piatakov April 19, 2023, 8:38 a.m. UTC
Sometimes a contiguous page range can't be allocated because some pages
in the range fail the isolation test. In this case, the CMA allocator
gets an EBUSY error and retries the allocation in a slightly shifted
range. During this procedure, a user may see messages like:
    alloc_contig_range: [70000, 80000) PFNs busy
In most cases everything will be OK, because an isolation test failure
is a recoverable issue and the CMA allocator takes care of it by
retrying the allocation.

This approach works well when a small piece of memory is allocated from
a big CMA region. But there are cases when the caller needs to allocate
the entire CMA region at once.

For example, a module requires a lot of CMA memory and a region of the
requested size is bound to the module in the DTS file. When the module
tries to allocate its entire region at once and the isolation test
fails, the situation differs from the usual one because:
 - it is not possible to allocate pages in another range of the CMA
   region (because the module requires the whole range from the
   beginning to the end);
 - the module (from the client's point of view) doesn't expect its
   request to be rejected (because it has its own dedicated CMA region
   declared in the DTS).

This issue should be handled in the CMA allocator layer, as this is the
lowest layer where the reason for the failure can be distinguished. The
allocator doesn't return an error code; it just returns a pointer (to a
page structure). So when the caller gets NULL, it can't tell what kind
of problem happened inside (EBUSY, ENOMEM, or something else).

To avoid cases where the CMA region has enough room for the requested
pages but returns NULL due to a failed isolation test, it is proposed
to:
 - add a separate branch to handle cases when the entire region is
   requested;
 - as an initial solution, retry the allocation several times (in the
   setup where the issue was observed, this solution helps).

Signed-off-by: Sergii Piatakov <sergii.piatakov@globallogic.com>
---
 mm/cma.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

Comments

Minchan Kim May 4, 2023, 4:30 p.m. UTC | #1
On Wed, Apr 19, 2023 at 11:38:51AM +0300, Sergii Piatakov wrote:
> Sometimes a contiguous page range can't be allocated because some pages
> in the range fail the isolation test. In this case, the CMA allocator
> gets an EBUSY error and retries the allocation in a slightly shifted
> range. During this procedure, a user may see messages like:
>     alloc_contig_range: [70000, 80000) PFNs busy
> In most cases everything will be OK, because an isolation test failure
> is a recoverable issue and the CMA allocator takes care of it by
> retrying the allocation.
> 
> This approach works well when a small piece of memory is allocated from
> a big CMA region. But there are cases when the caller needs to allocate
> the entire CMA region at once.

I agree that's a valid use case.

> 
> For example, a module requires a lot of CMA memory and a region of the
> requested size is bound to the module in the DTS file. When the module
> tries to allocate its entire region at once and the isolation test
> fails, the situation differs from the usual one because:
>  - it is not possible to allocate pages in another range of the CMA
>    region (because the module requires the whole range from the
>    beginning to the end);
>  - the module (from the client's point of view) doesn't expect its
>    request to be rejected (because it has its own dedicated CMA region
>    declared in the DTS).

That is not a valid expectation. Every CMA client should expect that a
CMA allocation can fail, since there are many reasons why it can.

> 
> This issue should be handled in the CMA allocator layer, as this is the
> lowest layer where the reason for the failure can be distinguished. The
> allocator doesn't return an error code; it just returns a pointer (to a
> page structure). So when the caller gets NULL, it can't tell what kind
> of problem happened inside (EBUSY, ENOMEM, or something else).
> 
> To avoid cases where the CMA region has enough room for the requested
> pages but returns NULL due to a failed isolation test, it is proposed
> to:
>  - add a separate branch to handle cases when the entire region is
>    requested;

If we want to take this approach, can't we also consider the case where
the request size is greater than half the size of the CMA area?

Furthermore, what happens if the CMA area is shared with others and the
remaining free memory is only just enough for the requested size? In that
case, it also returns without a further retry. (I am thinking about how we
can generalize this, if we want to add a retry option to increase the
success ratio not only for entire-range requests but also for other cases.)

>  - as an initial solution, retry the allocation several times (in the
>    setup where the issue was observed, this solution helps).

At a quick look, I think the CMA client needs to handle the failure.
If they request the entire range, they should try harder (e.g., multiple
attempts). (Just FYI, folks have tried such a retry option multiple times,
even when it was not an entire-range request, since CMA allocation is
fragile.)
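
The client-side approach suggested above could look roughly like the
following userspace sketch. It is not kernel code: `struct fake_region`,
`fake_alloc_whole_region()` and `alloc_with_retries()` are hypothetical
names standing in for a driver and for an allocator that, like
cma_alloc(), reports failure only as NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a dedicated region (not a kernel type). */
struct fake_region {
	int busy_attempts_left;	/* how many more calls will fail */
	int calls;		/* total number of allocation calls */
	char backing[64];	/* pretend memory handed to the caller */
};

/* Hypothetical allocator: like cma_alloc(), failure is just NULL. */
static void *fake_alloc_whole_region(struct fake_region *r)
{
	r->calls++;
	if (r->busy_attempts_left > 0) {
		r->busy_attempts_left--;
		return NULL;
	}
	return r->backing;
}

/* Client-side retry: repeat the same request up to max_attempts times. */
static void *alloc_with_retries(struct fake_region *r, int max_attempts)
{
	assert(max_attempts > 0);

	for (int attempt = 0; attempt < max_attempts; attempt++) {
		void *p = fake_alloc_whole_region(r);

		if (p)
			return p;
		/* A real driver would probably sleep or yield here. */
	}
	return NULL;
}
```

With a region that is busy for the first two calls,
`alloc_with_retries(&r, 5)` succeeds on the third call. Because the
caller only ever sees NULL, it cannot tell a transient failure from a
permanent one, so the attempt limit is the only stopping condition.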

> 
> Signed-off-by: Sergii Piatakov <sergii.piatakov@globallogic.com>
> ---
>  mm/cma.c | 23 +++++++++++++++++++++--
>  1 file changed, 21 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/cma.c b/mm/cma.c
> index a7263aa02c92..37e2bc34391b 100644
> --- a/mm/cma.c
> +++ b/mm/cma.c
> @@ -431,6 +431,7 @@ struct page *cma_alloc(struct cma *cma, unsigned long count,
>  	unsigned long i;
>  	struct page *page = NULL;
>  	int ret = -ENOMEM;
> +	int retry = 0;
>  
>  	if (!cma || !cma->count || !cma->bitmap)
>  		goto out;
> @@ -487,8 +488,26 @@ struct page *cma_alloc(struct cma *cma, unsigned long count,
>  
>  		trace_cma_alloc_busy_retry(cma->name, pfn, pfn_to_page(pfn),
>  					   count, align);
> -		/* try again with a bit different memory target */
> -		start = bitmap_no + mask + 1;
> +
> +		/*
> +		 * The region has enough free space, but it can't be provided right now
> +		 * because the underlying layer is busy and can't perform allocation.
> +		 * Here we have different options depending on each particular case.
> +		 */
> +
> +		if (!start && !offset && bitmap_maxno == bitmap_count) {
> +			/*
> +			 * If the whole region is requested it means that:
> +			 *  - there is no room to retry the allocation in another range;
> +			 *  - most likely somebody tries to allocate a dedicated CMA region.
> +			 * So in this case we can just retry allocation several times with the
> +			 * same parameters.
> +			 */
> +			if (retry++ >= 5/*maxretry*/)
> +				break;
> +		} else
> +			/* In other cases try again with a bit different memory target */
> +			start = bitmap_no + mask + 1;
>  	}
>  
>  	trace_cma_alloc_finish(cma->name, pfn, page, count, align, ret);
> -- 
> 2.25.1
> 
>
Sergii Piatakov May 6, 2023, 3:46 p.m. UTC | #2
> That is not a valid expectation. Every CMA client should expect that a
> CMA allocation can fail, since there are many reasons why it can.

Understood, thank you for the clarification!

> If we want to take this approach, can't we also consider the case where
> the request size is greater than half the size of the CMA area?

Actually, my original intention was to introduce retrying only for cases
when the whole region is requested.

But I agree that there could potentially be several branches with optimal
handling for some specific cases and one fallback branch with a generic
approach. I tried to emphasize this idea in the following comment:
> > +                * Here we have different options depending on each particular case.

> Furthermore, what happens if the CMA area is shared with others and the
> remaining free memory is only just enough for the requested size? In that
> case, it also returns without a further retry.

I think that such cases could be covered in a dedicated branch (I mean an
if-else branch).

> I am thinking about how we can generalize this, if we want to add a
> retry option to increase the success ratio not only for entire-range
> requests but also for other cases.

By the way, based on my personal observation, shifting the requested page
range may potentially reduce the success ratio for cases when allocation
fails due to isolation tests.
This is because the pages are updated in the direction from lower indexes
to higher ones. And if page number N doesn't fit the isolation requirements,
it is likely that page N+1 doesn't fit the requirements either. Moreover,
page N+1 will be updated and pass the isolation test later than page N!
So moving the requested page range in the same direction (from lower to
higher indexes) may reduce the success ratio!
Per my understanding, if allocation fails due to an isolation test, it would
be better to request the same region again without any shift!

Please keep in mind that my experience is based on one particular use case,
so I may be wrong!

> At a quick look, I think the CMA client needs to handle the failure.
> If they request the entire range, they should try harder (e.g., multiple
> attempts). (Just FYI, folks have tried such a retry option multiple times,
> even when it was not an entire-range request, since CMA allocation is
> fragile.)

Thank you for providing this comment, I really appreciate it!
I understand that a CMA allocation is not guaranteed to succeed on the first
attempt, so a module should retry the allocation by itself! We will apply
the suggested approach!

Just one comment from my side. In my opinion, retrying the allocation in a
module would be a perfect solution if the module knew the exact reason why
the allocation failed (EBUSY, ENOMEM, etc). Based on the actual error code,
the module could choose proper handling for each particular case. Without
knowing the exact error code (having only a NULL pointer), retrying looks
like a kind of workaround rather than a proper solution.
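
To illustrate the point about error codes, here is a userspace sketch of
what a caller could do if the allocator reported the reason for a failure.
No such interface exists in the patch or in cma_alloc() as discussed here:
`struct fake_cma`, `fake_cma_alloc_err()` and `alloc_retry_on_busy()` are
hypothetical names, and the error values come from the C library's
<errno.h> rather than from the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical region model (not a kernel type). */
struct fake_cma {
	int busy_left;		/* calls that will fail with -EBUSY */
	int out_of_memory;	/* non-zero: every call fails with -ENOMEM */
	int calls;		/* total number of allocation calls */
	char backing[64];	/* pretend memory handed to the caller */
};

/* Hypothetical allocator that returns 0 or a negative errno value. */
static int fake_cma_alloc_err(struct fake_cma *cma, void **out)
{
	cma->calls++;
	*out = NULL;
	if (cma->out_of_memory)
		return -ENOMEM;
	if (cma->busy_left > 0) {
		cma->busy_left--;
		return -EBUSY;
	}
	*out = cma->backing;
	return 0;
}

/* Retry only when the failure is transient (-EBUSY). */
static int alloc_retry_on_busy(struct fake_cma *cma, int max_attempts,
			       void **out)
{
	int ret = -EINVAL;	/* returned if max_attempts <= 0 */

	assert(out != NULL);
	*out = NULL;

	for (int attempt = 0; attempt < max_attempts; attempt++) {
		ret = fake_cma_alloc_err(cma, out);
		if (ret != -EBUSY)
			break;	/* success, or an error retrying won't fix */
	}
	return ret;
}
```

In this model a region that reports -ENOMEM is given up on after a single
call, while a busy region is retried until it succeeds or the attempt
limit is reached.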


Patch

diff --git a/mm/cma.c b/mm/cma.c
index a7263aa02c92..37e2bc34391b 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -431,6 +431,7 @@  struct page *cma_alloc(struct cma *cma, unsigned long count,
 	unsigned long i;
 	struct page *page = NULL;
 	int ret = -ENOMEM;
+	int retry = 0;
 
 	if (!cma || !cma->count || !cma->bitmap)
 		goto out;
@@ -487,8 +488,26 @@  struct page *cma_alloc(struct cma *cma, unsigned long count,
 
 		trace_cma_alloc_busy_retry(cma->name, pfn, pfn_to_page(pfn),
 					   count, align);
-		/* try again with a bit different memory target */
-		start = bitmap_no + mask + 1;
+
+		/*
+		 * The region has enough free space, but it can't be provided right now
+		 * because the underlying layer is busy and can't perform allocation.
+		 * Here we have different options depending on each particular case.
+		 */
+
+		if (!start && !offset && bitmap_maxno == bitmap_count) {
+			/*
+			 * If the whole region is requested it means that:
+			 *  - there is no room to retry the allocation in another range;
+			 *  - most likely somebody tries to allocate a dedicated CMA region.
+			 * So in this case we can just retry allocation several times with the
+			 * same parameters.
+			 */
+			if (retry++ >= 5/*maxretry*/)
+				break;
+		} else
+			/* In other cases try again with a bit different memory target */
+			start = bitmap_no + mask + 1;
 	}
 
 	trace_cma_alloc_finish(cma->name, pfn, page, count, align, ret);
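
For reference, the decision this patch adds to the EBUSY path can be
modeled outside the kernel as a small pure function. This is only a
sketch: `enum busy_action`, `is_whole_region_request()`,
`next_action_on_busy()` and `MAX_WHOLE_REGION_RETRIES` are hypothetical
names. Only the whole-region condition and the limit of 5 are taken from
the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What the allocation loop should do after an EBUSY failure. */
enum busy_action {
	BUSY_RETRY_SAME_RANGE,	/* whole region requested: try again as-is */
	BUSY_SHIFT_RANGE,	/* partial request: move the search start */
	BUSY_GIVE_UP,		/* whole region requested, retries exhausted */
};

/* Mirrors the literal 5 used in the patch. */
#define MAX_WHOLE_REGION_RETRIES 5

/* Same condition as in the patch: the request covers the whole bitmap. */
static bool is_whole_region_request(unsigned long start, unsigned long offset,
				    unsigned long bitmap_maxno,
				    unsigned long bitmap_count)
{
	return !start && !offset && bitmap_maxno == bitmap_count;
}

/* Decide the next step; *retry counts whole-region retries so far. */
static enum busy_action next_action_on_busy(unsigned long start,
					    unsigned long offset,
					    unsigned long bitmap_maxno,
					    unsigned long bitmap_count,
					    int *retry)
{
	assert(retry != NULL);

	if (is_whole_region_request(start, offset, bitmap_maxno,
				    bitmap_count)) {
		if ((*retry)++ >= MAX_WHOLE_REGION_RETRIES)
			return BUSY_GIVE_UP;
		return BUSY_RETRY_SAME_RANGE;
	}
	return BUSY_SHIFT_RANGE;
}
```

In this model a whole-region request is retried with the same parameters
five times after the first failure and then abandoned, while any other
request keeps the original behavior of shifting the search start.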