diff mbox series

mm, page_alloc: skip ->waternark_boost for atomic order-0 allocations

Message ID 1589882284-21010-1-git-send-email-charante@codeaurora.org (mailing list archive)
State New, archived
Headers show
Series mm, page_alloc: skip ->waternark_boost for atomic order-0 allocations | expand

Commit Message

Charan Teja Kalla May 19, 2020, 9:58 a.m. UTC
When boosting is enabled, it is observed that rate of atomic order-0
allocation failures are high due to the fact that free levels in the
system are checked with ->watermark_boost offset. This is not a problem
for sleepable allocations but for atomic allocations which looks like
regression.

This problem is seen frequently on system setup of Android kernel
running on Snapdragon hardware with 4GB RAM size. When no extfrag event
occurred in the system, ->watermark_boost factor is zero, thus the
watermark configurations in the system are:
   _watermark = (
          [WMARK_MIN] = 1272, --> ~5MB
          [WMARK_LOW] = 9067, --> ~36MB
          [WMARK_HIGH] = 9385), --> ~38MB
   watermark_boost = 0

After launching some memory hungry applications in Android which can
cause extfrag events in the system to an extent that ->watermark_boost
can be set to max i.e. default boost factor makes it to 150% of high
watermark.
   _watermark = (
          [WMARK_MIN] = 1272, --> ~5MB
          [WMARK_LOW] = 9067, --> ~36MB
          [WMARK_HIGH] = 9385), --> ~38MB
   watermark_boost = 14077, -->~57MB

With default system configuration, for an atomic order-0 allocation to
succeed, having free memory of ~2MB will suffice. But boosting makes
the min_wmark to ~61MB thus for an atomic order-0 allocation to be
successful system should have minimum of ~23MB of free memory(from
calculations of zone_watermark_ok(), min = 3/4(min/2)). But failures are
observed despite system is having ~20MB of free memory. In the testing,
this is reproducible as early as first 300secs since boot and with
furtherlowram configurations(<2GB) it is observed as early as first
150secs since boot.

These failures can be avoided by excluding the ->watermark_boost in
watermark caluculations for atomic order-0 allocations.

Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
---
 mm/page_alloc.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

Comments

Andrew Morton May 20, 2020, 1:40 a.m. UTC | #1
On Tue, 19 May 2020 15:28:04 +0530 Charan Teja Reddy <charante@codeaurora.org> wrote:

> When boosting is enabled, it is observed that rate of atomic order-0
> allocation failures are high due to the fact that free levels in the
> system are checked with ->watermark_boost offset. This is not a problem
> for sleepable allocations but for atomic allocations which looks like
> regression.
> 
> This problem is seen frequently on system setup of Android kernel
> running on Snapdragon hardware with 4GB RAM size. When no extfrag event
> occurred in the system, ->watermark_boost factor is zero, thus the
> watermark configurations in the system are:
>    _watermark = (
>           [WMARK_MIN] = 1272, --> ~5MB
>           [WMARK_LOW] = 9067, --> ~36MB
>           [WMARK_HIGH] = 9385), --> ~38MB
>    watermark_boost = 0
> 
> After launching some memory hungry applications in Android which can
> cause extfrag events in the system to an extent that ->watermark_boost
> can be set to max i.e. default boost factor makes it to 150% of high
> watermark.
>    _watermark = (
>           [WMARK_MIN] = 1272, --> ~5MB
>           [WMARK_LOW] = 9067, --> ~36MB
>           [WMARK_HIGH] = 9385), --> ~38MB
>    watermark_boost = 14077, -->~57MB
> 
> With default system configuration, for an atomic order-0 allocation to
> succeed, having free memory of ~2MB will suffice. But boosting makes
> the min_wmark to ~61MB thus for an atomic order-0 allocation to be
> successful system should have minimum of ~23MB of free memory(from
> calculations of zone_watermark_ok(), min = 3/4(min/2)). But failures are
> observed despite system is having ~20MB of free memory. In the testing,
> this is reproducible as early as first 300secs since boot and with
> furtherlowram configurations(<2GB) it is observed as early as first
> 150secs since boot.
> 
> These failures can be avoided by excluding the ->watermark_boost in
> watermark caluculations for atomic order-0 allocations.

Seems sensible.

> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3709,6 +3709,18 @@ static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone)
>  		}
>  
>  		mark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
> +		/*
> +		 * Allow GFP_ATOMIC order-0 allocations to exclude the
> +		 * zone->watermark_boost in its watermark calculations.
> +		 * We rely on the ALLOC_ flags set for GFP_ATOMIC
> +		 * requests in gfp_to_alloc_flags() for this. Reason not to
> +		 * use the GFP_ATOMIC directly is that we want to fall back
> +		 * to slow path thus wake up kswapd.

Nice comment, but I don't understand it ;)

Why would testing gfp_mask prevent us from waking kswapd?

> +		 */
> +		if (unlikely(!order && !(alloc_flags & ALLOC_WMARK_MASK) &&
> +		     (alloc_flags & (ALLOC_HARDER | ALLOC_HIGH)))) {
> +			mark = zone->_watermark[WMARK_MIN];
> +		}

Why is this not implemented for higher-order allocation attempts?

>  		if (!zone_watermark_fast(zone, order, mark,
>  				       ac->highest_zoneidx, alloc_flags)) {
>  			int ret;
Charan Teja Kalla May 20, 2020, 4:36 p.m. UTC | #2
Thank you Andrew for the comments..

On 5/20/2020 7:10 AM, Andrew Morton wrote:
> On Tue, 19 May 2020 15:28:04 +0530 Charan Teja Reddy <charante@codeaurora.org> wrote:
> 
>> When boosting is enabled, it is observed that rate of atomic order-0
>> allocation failures are high due to the fact that free levels in the
>> system are checked with ->watermark_boost offset. This is not a problem
>> for sleepable allocations but for atomic allocations which looks like
>> regression.
>>
>> This problem is seen frequently on system setup of Android kernel
>> running on Snapdragon hardware with 4GB RAM size. When no extfrag event
>> occurred in the system, ->watermark_boost factor is zero, thus the
>> watermark configurations in the system are:
>>    _watermark = (
>>           [WMARK_MIN] = 1272, --> ~5MB
>>           [WMARK_LOW] = 9067, --> ~36MB
>>           [WMARK_HIGH] = 9385), --> ~38MB
>>    watermark_boost = 0
>>
>> After launching some memory hungry applications in Android which can
>> cause extfrag events in the system to an extent that ->watermark_boost
>> can be set to max i.e. default boost factor makes it to 150% of high
>> watermark.
>>    _watermark = (
>>           [WMARK_MIN] = 1272, --> ~5MB
>>           [WMARK_LOW] = 9067, --> ~36MB
>>           [WMARK_HIGH] = 9385), --> ~38MB
>>    watermark_boost = 14077, -->~57MB
>>
>> With default system configuration, for an atomic order-0 allocation to
>> succeed, having free memory of ~2MB will suffice. But boosting makes
>> the min_wmark to ~61MB thus for an atomic order-0 allocation to be
>> successful system should have minimum of ~23MB of free memory(from
>> calculations of zone_watermark_ok(), min = 3/4(min/2)). But failures are
>> observed despite system is having ~20MB of free memory. In the testing,
>> this is reproducible as early as first 300secs since boot and with
>> furtherlowram configurations(<2GB) it is observed as early as first
>> 150secs since boot.
>>
>> These failures can be avoided by excluding the ->watermark_boost in
>> watermark caluculations for atomic order-0 allocations.
> 
> Seems sensible.
> 
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -3709,6 +3709,18 @@ static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone)
>>  		}
>>  
>>  		mark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
>> +		/*
>> +		 * Allow GFP_ATOMIC order-0 allocations to exclude the
>> +		 * zone->watermark_boost in its watermark calculations.
>> +		 * We rely on the ALLOC_ flags set for GFP_ATOMIC
>> +		 * requests in gfp_to_alloc_flags() for this. Reason not to
>> +		 * use the GFP_ATOMIC directly is that we want to fall back
>> +		 * to slow path thus wake up kswapd.
> 
> Nice comment, but I don't understand it ;)
> 
> Why would testing gfp_mask prevent us from waking kswapd?

This piece of code is in the common function get_page_from_freelist()
which will be called in the below order:
1) alloc_pages() with alloc_flags = ALLOC_WMARK_LOW. If watermark check
fails here then,
2) Allocation request will fall back to the slow path,
__alloc_pages_slowpath with alloc_flags = ALLOC_WMARK_MIN, **where it
wakes up kswapd**.

If I use the GFP_ATOMIC directly then even if watermarks are boosted
(and kswapd is yet to wake up), then atomic allocation can return
success from 1) with out even waking up kswapd. This doesn't seem correct.

> 
>> +		 */
>> +		if (unlikely(!order && !(alloc_flags & ALLOC_WMARK_MASK) &&
>> +		     (alloc_flags & (ALLOC_HARDER | ALLOC_HIGH)))) {
>> +			mark = zone->_watermark[WMARK_MIN];
>> +		}
> 
> Why is this not implemented for higher-order allocation attempts?

I don't think that higher-order allocation failures are such critical, I
meant that there will be no critical users who can rely on higher-order
atomic allocations to be successful. Please correct me If I am wrong
here. Atleast this is the case on Android systems..

> 
>>  		if (!zone_watermark_fast(zone, order, mark,
>>  				       ac->highest_zoneidx, alloc_flags)) {
>>  			int ret;
>
Andrew Morton June 4, 2020, 9:43 p.m. UTC | #3
On Tue, 19 May 2020 15:28:04 +0530 Charan Teja Reddy <charante@codeaurora.org> wrote:

> When boosting is enabled, it is observed that rate of atomic order-0
> allocation failures are high due to the fact that free levels in the
> system are checked with ->watermark_boost offset. This is not a problem
> for sleepable allocations but for atomic allocations which looks like
> regression.
> 
> This problem is seen frequently on system setup of Android kernel
> running on Snapdragon hardware with 4GB RAM size. When no extfrag event
> occurred in the system, ->watermark_boost factor is zero, thus the
> watermark configurations in the system are:
>    _watermark = (
>           [WMARK_MIN] = 1272, --> ~5MB
>           [WMARK_LOW] = 9067, --> ~36MB
>           [WMARK_HIGH] = 9385), --> ~38MB
>    watermark_boost = 0
> 
> After launching some memory hungry applications in Android which can
> cause extfrag events in the system to an extent that ->watermark_boost
> can be set to max i.e. default boost factor makes it to 150% of high
> watermark.
>    _watermark = (
>           [WMARK_MIN] = 1272, --> ~5MB
>           [WMARK_LOW] = 9067, --> ~36MB
>           [WMARK_HIGH] = 9385), --> ~38MB
>    watermark_boost = 14077, -->~57MB
> 
> With default system configuration, for an atomic order-0 allocation to
> succeed, having free memory of ~2MB will suffice. But boosting makes
> the min_wmark to ~61MB thus for an atomic order-0 allocation to be
> successful system should have minimum of ~23MB of free memory(from
> calculations of zone_watermark_ok(), min = 3/4(min/2)). But failures are
> observed despite system is having ~20MB of free memory. In the testing,
> this is reproducible as early as first 300secs since boot and with
> furtherlowram configurations(<2GB) it is observed as early as first
> 150secs since boot.
> 
> These failures can be avoided by excluding the ->watermark_boost in
> watermark caluculations for atomic order-0 allocations.

Do we have any additional reviewer input on this one?

> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3709,6 +3709,18 @@ static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone)
>  		}
>  
>  		mark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
> +		/*
> +		 * Allow GFP_ATOMIC order-0 allocations to exclude the
> +		 * zone->watermark_boost in its watermark calculations.
> +		 * We rely on the ALLOC_ flags set for GFP_ATOMIC
> +		 * requests in gfp_to_alloc_flags() for this. Reason not to
> +		 * use the GFP_ATOMIC directly is that we want to fall back
> +		 * to slow path thus wake up kswapd.
> +		 */
> +		if (unlikely(!order && !(alloc_flags & ALLOC_WMARK_MASK) &&
> +		     (alloc_flags & (ALLOC_HARDER | ALLOC_HIGH)))) {
> +			mark = zone->_watermark[WMARK_MIN];
> +		}
>  		if (!zone_watermark_fast(zone, order, mark,
>  				       ac->highest_zoneidx, alloc_flags)) {
>  			int ret;

It would seem smart to do

--- a/mm/page_alloc.c~mm-page_alloc-skip-waternark_boost-for-atomic-order-0-allocations-fix
+++ a/mm/page_alloc.c
@@ -3745,7 +3745,6 @@ retry:
 			}
 		}
 
-		mark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
 		/*
 		 * Allow GFP_ATOMIC order-0 allocations to exclude the
 		 * zone->watermark_boost in their watermark calculations.
@@ -3757,6 +3756,8 @@ retry:
 		if (unlikely(!order && !(alloc_flags & ALLOC_WMARK_MASK) &&
 		     (alloc_flags & (ALLOC_HARDER | ALLOC_HIGH)))) {
 			mark = zone->_watermark[WMARK_MIN];
+		} else {
+			mark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
 		}
 		if (!zone_watermark_fast(zone, order, mark,
 				       ac->highest_zoneidx, alloc_flags)) {

but that makes page_alloc.o 16 bytes larger, so I guess don't bother.
Charan Teja Kalla June 9, 2020, 10:59 a.m. UTC | #4
Adding more people to get additional reviewer inputs.

On 6/5/2020 3:13 AM, Andrew Morton wrote:
> On Tue, 19 May 2020 15:28:04 +0530 Charan Teja Reddy <charante@codeaurora.org> wrote:
> 
>> When boosting is enabled, it is observed that rate of atomic order-0
>> allocation failures are high due to the fact that free levels in the
>> system are checked with ->watermark_boost offset. This is not a problem
>> for sleepable allocations but for atomic allocations which looks like
>> regression.
>>
>> This problem is seen frequently on system setup of Android kernel
>> running on Snapdragon hardware with 4GB RAM size. When no extfrag event
>> occurred in the system, ->watermark_boost factor is zero, thus the
>> watermark configurations in the system are:
>>    _watermark = (
>>           [WMARK_MIN] = 1272, --> ~5MB
>>           [WMARK_LOW] = 9067, --> ~36MB
>>           [WMARK_HIGH] = 9385), --> ~38MB
>>    watermark_boost = 0
>>
>> After launching some memory hungry applications in Android which can
>> cause extfrag events in the system to an extent that ->watermark_boost
>> can be set to max i.e. default boost factor makes it to 150% of high
>> watermark.
>>    _watermark = (
>>           [WMARK_MIN] = 1272, --> ~5MB
>>           [WMARK_LOW] = 9067, --> ~36MB
>>           [WMARK_HIGH] = 9385), --> ~38MB
>>    watermark_boost = 14077, -->~57MB
>>
>> With default system configuration, for an atomic order-0 allocation to
>> succeed, having free memory of ~2MB will suffice. But boosting makes
>> the min_wmark to ~61MB thus for an atomic order-0 allocation to be
>> successful system should have minimum of ~23MB of free memory(from
>> calculations of zone_watermark_ok(), min = 3/4(min/2)). But failures are
>> observed despite system is having ~20MB of free memory. In the testing,
>> this is reproducible as early as first 300secs since boot and with
>> furtherlowram configurations(<2GB) it is observed as early as first
>> 150secs since boot.
>>
>> These failures can be avoided by excluding the ->watermark_boost in
>> watermark caluculations for atomic order-0 allocations.
> 
> Do we have any additional reviewer input on this one?
> 
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -3709,6 +3709,18 @@ static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone)
>>  		}
>>  
>>  		mark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
>> +		/*
>> +		 * Allow GFP_ATOMIC order-0 allocations to exclude the
>> +		 * zone->watermark_boost in its watermark calculations.
>> +		 * We rely on the ALLOC_ flags set for GFP_ATOMIC
>> +		 * requests in gfp_to_alloc_flags() for this. Reason not to
>> +		 * use the GFP_ATOMIC directly is that we want to fall back
>> +		 * to slow path thus wake up kswapd.
>> +		 */
>> +		if (unlikely(!order && !(alloc_flags & ALLOC_WMARK_MASK) &&
>> +		     (alloc_flags & (ALLOC_HARDER | ALLOC_HIGH)))) {
>> +			mark = zone->_watermark[WMARK_MIN];
>> +		}
>>  		if (!zone_watermark_fast(zone, order, mark,
>>  				       ac->highest_zoneidx, alloc_flags)) {
>>  			int ret;
> 
> It would seem smart to do
> 
> --- a/mm/page_alloc.c~mm-page_alloc-skip-waternark_boost-for-atomic-order-0-allocations-fix
> +++ a/mm/page_alloc.c
> @@ -3745,7 +3745,6 @@ retry:
>  			}
>  		}
>  
> -		mark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
>  		/*
>  		 * Allow GFP_ATOMIC order-0 allocations to exclude the
>  		 * zone->watermark_boost in their watermark calculations.
> @@ -3757,6 +3756,8 @@ retry:
>  		if (unlikely(!order && !(alloc_flags & ALLOC_WMARK_MASK) &&
>  		     (alloc_flags & (ALLOC_HARDER | ALLOC_HIGH)))) {
>  			mark = zone->_watermark[WMARK_MIN];
> +		} else {
> +			mark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
>  		}
>  		if (!zone_watermark_fast(zone, order, mark,
>  				       ac->highest_zoneidx, alloc_flags)) {
> 
> but that makes page_alloc.o 16 bytes larger, so I guess don't bother.
>
Mel Gorman June 9, 2020, 12:28 p.m. UTC | #5
On Tue, May 19, 2020 at 03:28:04PM +0530, Charan Teja Reddy wrote:
> When boosting is enabled, it is observed that rate of atomic order-0
> allocation failures are high due to the fact that free levels in the
> system are checked with ->watermark_boost offset. This is not a problem
> for sleepable allocations but for atomic allocations which looks like
> regression.
> 

Are high-order allocations in general of interest to this platform? If
not then a potential option is to simply disable boosting. The patch is
still relevant but it's worth thinking about.

> This problem is seen frequently on system setup of Android kernel
> running on Snapdragon hardware with 4GB RAM size. When no extfrag event
> occurred in the system, ->watermark_boost factor is zero, thus the
> watermark configurations in the system are:
>    _watermark = (
>           [WMARK_MIN] = 1272, --> ~5MB
>           [WMARK_LOW] = 9067, --> ~36MB
>           [WMARK_HIGH] = 9385), --> ~38MB
>    watermark_boost = 0
> 
> After launching some memory hungry applications in Android which can
> cause extfrag events in the system to an extent that ->watermark_boost
> can be set to max i.e. default boost factor makes it to 150% of high
> watermark.
>    _watermark = (
>           [WMARK_MIN] = 1272, --> ~5MB
>           [WMARK_LOW] = 9067, --> ~36MB
>           [WMARK_HIGH] = 9385), --> ~38MB
>    watermark_boost = 14077, -->~57MB
> 
> With default system configuration, for an atomic order-0 allocation to
> succeed, having free memory of ~2MB will suffice. But boosting makes
> the min_wmark to ~61MB thus for an atomic order-0 allocation to be
> successful system should have minimum of ~23MB of free memory(from
> calculations of zone_watermark_ok(), min = 3/4(min/2)). But failures are
> observed despite system is having ~20MB of free memory. In the testing,
> this is reproducible as early as first 300secs since boot and with
> furtherlowram configurations(<2GB) it is observed as early as first
> 150secs since boot.
> 
> These failures can be avoided by excluding the ->watermark_boost in
> watermark caluculations for atomic order-0 allocations.
> 
> Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
> ---
>  mm/page_alloc.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index d001d61..5193d7e 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3709,6 +3709,18 @@ static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone)
>  		}
>  
>  		mark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
> +		/*
> +		 * Allow GFP_ATOMIC order-0 allocations to exclude the
> +		 * zone->watermark_boost in its watermark calculations.
> +		 * We rely on the ALLOC_ flags set for GFP_ATOMIC
> +		 * requests in gfp_to_alloc_flags() for this. Reason not to
> +		 * use the GFP_ATOMIC directly is that we want to fall back
> +		 * to slow path thus wake up kswapd.
> +		 */

The comment is a bit difficult to parse. Maybe this.

	/*
	 * Ignore watermark boosting for GFP_ATOMIC order-0 allocations
	 * when checking the min watermark. The min watermark is the
	 * point where boosting is ignored so that kswapd is woken up
	 * when below the low watermark.
	 */

I left out the ALLOC_ part for reasons that are explained blow.

> +		if (unlikely(!order && !(alloc_flags & ALLOC_WMARK_MASK) &&
> +		     (alloc_flags & (ALLOC_HARDER | ALLOC_HIGH)))) {
> +			mark = zone->_watermark[WMARK_MIN];
> +		}

The second check is a bit more obscure than it needs to be and depends
on WMARK_MIN == 0. That will probably be true forever but it's not
obvious at a glance. I suggest something like
((alloc_flags & ALLOC_WMARK_MASK) == WMARK_MIN).

For detecting atomic alloctions, you rely on the either ALLOC_HARDER or
ALLOC_HIGH being set. ALLOC_HIGH can be set for non-atomic allocations
and ALLOC_HARDER can be set for RT tasks. You probably should just test
the gfp_mask because as it stands non-atomic allocations can ignore the
boost too.

Finally, the patch puts an unlikely check into a relatively fast path even
though watermarks may be fine with or without boosting.  Instead you could
put the checks in zone_watermark_fast() if and only if the watermarks
failed the first time. If the checks pass, the watermarks get checked
a second time. This will be fractionally slower for requests failing
watermark checks but there is no penalty for most allocation requests.
It would need the gfp_mask to be passed into zone_watermark_fast but
as it's an inlined function, there should be no cost to passing in the
arguement i.e. do something like this at the end of zone_watermark_fast

	if (__zone_watermark_ok(z, order, mark, classzone_idx, alloc_flags, free_pages))
		return true;

	/* Ignore watermark boosting for .... */
	if (unlikely(!order .....) {
		mark = ...
		return __zone_watermark_ok(...);
	}

	return false;
Charan Teja Kalla June 12, 2020, 11:07 a.m. UTC | #6
Thanks Mel for feedback.

On 6/9/2020 5:58 PM, Mel Gorman wrote:
> On Tue, May 19, 2020 at 03:28:04PM +0530, Charan Teja Reddy wrote:
>> When boosting is enabled, it is observed that rate of atomic order-0
>> allocation failures are high due to the fact that free levels in the
>> system are checked with ->watermark_boost offset. This is not a problem
>> for sleepable allocations but for atomic allocations which looks like
>> regression.
>>
> 
> Are high-order allocations in general of interest to this platform? If
> not then a potential option is to simply disable boosting. The patch is
> still relevant but it's worth thinking about.
> 

Yes we do care till order-3.

>> This problem is seen frequently on system setup of Android kernel
>> running on Snapdragon hardware with 4GB RAM size. When no extfrag event
>> occurred in the system, ->watermark_boost factor is zero, thus the
>> watermark configurations in the system are:
>>    _watermark = (
>>           [WMARK_MIN] = 1272, --> ~5MB
>>           [WMARK_LOW] = 9067, --> ~36MB
>>           [WMARK_HIGH] = 9385), --> ~38MB
>>    watermark_boost = 0
>>
>> After launching some memory hungry applications in Android which can
>> cause extfrag events in the system to an extent that ->watermark_boost
>> can be set to max i.e. default boost factor makes it to 150% of high
>> watermark.
>>    _watermark = (
>>           [WMARK_MIN] = 1272, --> ~5MB
>>           [WMARK_LOW] = 9067, --> ~36MB
>>           [WMARK_HIGH] = 9385), --> ~38MB
>>    watermark_boost = 14077, -->~57MB
>>
>> With default system configuration, for an atomic order-0 allocation to
>> succeed, having free memory of ~2MB will suffice. But boosting makes
>> the min_wmark to ~61MB thus for an atomic order-0 allocation to be
>> successful system should have minimum of ~23MB of free memory(from
>> calculations of zone_watermark_ok(), min = 3/4(min/2)). But failures are
>> observed despite system is having ~20MB of free memory. In the testing,
>> this is reproducible as early as first 300secs since boot and with
>> furtherlowram configurations(<2GB) it is observed as early as first
>> 150secs since boot.
>>
>> These failures can be avoided by excluding the ->watermark_boost in
>> watermark caluculations for atomic order-0 allocations.
>>
>> Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
>> ---
>>  mm/page_alloc.c | 12 ++++++++++++
>>  1 file changed, 12 insertions(+)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index d001d61..5193d7e 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -3709,6 +3709,18 @@ static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone)
>>  		}
>>  
>>  		mark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
>> +		/*
>> +		 * Allow GFP_ATOMIC order-0 allocations to exclude the
>> +		 * zone->watermark_boost in its watermark calculations.
>> +		 * We rely on the ALLOC_ flags set for GFP_ATOMIC
>> +		 * requests in gfp_to_alloc_flags() for this. Reason not to
>> +		 * use the GFP_ATOMIC directly is that we want to fall back
>> +		 * to slow path thus wake up kswapd.
>> +		 */
> 
> The comment is a bit difficult to parse. Maybe this.
> 
> 	/*
> 	 * Ignore watermark boosting for GFP_ATOMIC order-0 allocations
> 	 * when checking the min watermark. The min watermark is the
> 	 * point where boosting is ignored so that kswapd is woken up
> 	 * when below the low watermark.
> 	 */
> 
> I left out the ALLOC_ part for reasons that are explained blow.
> 
>> +		if (unlikely(!order && !(alloc_flags & ALLOC_WMARK_MASK) &&
>> +		     (alloc_flags & (ALLOC_HARDER | ALLOC_HIGH)))) {
>> +			mark = zone->_watermark[WMARK_MIN];
>> +		}
> 
> The second check is a bit more obscure than it needs to be and depends
> on WMARK_MIN == 0. That will probably be true forever but it's not
> obvious at a glance. I suggest something like
> ((alloc_flags & ALLOC_WMARK_MASK) == WMARK_MIN).
> 
> For detecting atomic alloctions, you rely on the either ALLOC_HARDER or
> ALLOC_HIGH being set. ALLOC_HIGH can be set for non-atomic allocations
> and ALLOC_HARDER can be set for RT tasks. You probably should just test
> the gfp_mask because as it stands non-atomic allocations can ignore the
> boost too.
> 
> Finally, the patch puts an unlikely check into a relatively fast path even
> though watermarks may be fine with or without boosting.  Instead you could
> put the checks in zone_watermark_fast() if and only if the watermarks
> failed the first time. If the checks pass, the watermarks get checked
> a second time. This will be fractionally slower for requests failing
> watermark checks but there is no penalty for most allocation requests.
> It would need the gfp_mask to be passed into zone_watermark_fast but
> as it's an inlined function, there should be no cost to passing in the
> arguement i.e. do something like this at the end of zone_watermark_fast
> 
> 	if (__zone_watermark_ok(z, order, mark, classzone_idx, alloc_flags, free_pages))
> 		return true;
> 
> 	/* Ignore watermark boosting for .... */
> 	if (unlikely(!order .....) {
> 		mark = ...
> 		return __zone_watermark_ok(...);
> 	}
> 
> 	return false;
> 

Incorporated these suggestions at:
https://lore.kernel.org/patchwork/patch/1254998/. Can you please help in
reviewing?
diff mbox series

Patch

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d001d61..5193d7e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3709,6 +3709,18 @@  static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone)
 		}
 
 		mark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
+		/*
+		 * Allow GFP_ATOMIC order-0 allocations to exclude the
+		 * zone->watermark_boost in its watermark calculations.
+		 * We rely on the ALLOC_ flags set for GFP_ATOMIC
+		 * requests in gfp_to_alloc_flags() for this. Reason not to
+		 * use the GFP_ATOMIC directly is that we want to fall back
+		 * to slow path thus wake up kswapd.
+		 */
+		if (unlikely(!order && !(alloc_flags & ALLOC_WMARK_MASK) &&
+		     (alloc_flags & (ALLOC_HARDER | ALLOC_HIGH)))) {
+			mark = zone->_watermark[WMARK_MIN];
+		}
 		if (!zone_watermark_fast(zone, order, mark,
 				       ac->highest_zoneidx, alloc_flags)) {
 			int ret;