diff mbox series

vm_swappiness=0 should still try to avoid swapping anon memory

Message ID 20210806231701.106980-1-npache@redhat.com (mailing list archive)
State New
Headers show
Series vm_swappiness=0 should still try to avoid swapping anon memory | expand

Commit Message

Nico Pache Aug. 6, 2021, 11:17 p.m. UTC
Since commit b91ac374346b ("mm: vmscan: enforce inactive:active ratio at the
reclaim root") swappiness can start prematurely swapping anon memory.
This is due to the assumption that refaulting anon should always allow
the shrinker to target anon memory. Add a check for vm_swappiness being
>0 before indiscriminately targeting Anon.

Signed-off-by: Nico Pache <npache@redhat.com>
---
 mm/vmscan.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Shakeel Butt Aug. 7, 2021, 1 a.m. UTC | #1
On Fri, Aug 6, 2021 at 4:17 PM Nico Pache <npache@redhat.com> wrote:
>
> Since commit b91ac374346b ("mm: vmscan: enforce inactive:active ratio at the
> reclaim root") swappiness can start prematurely swapping anon memory.
> This is due to the assumption that refaulting anon should always allow
> the shrinker to target anon memory. Add a check for vm_swappiness being
> >0 before indiscriminately targeting Anon.

Did you actually observe this behavior?

>
> Signed-off-by: Nico Pache <npache@redhat.com>
> ---
>  mm/vmscan.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 4620df62f0ff..8b932ff72e37 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2909,8 +2909,8 @@ static void shrink_node(pg_data_t *pgdat, struct scan_control *sc)
>
>                 refaults = lruvec_page_state(target_lruvec,
>                                 WORKINGSET_ACTIVATE_ANON);
> -               if (refaults != target_lruvec->refaults[0] ||
> -                       inactive_is_low(target_lruvec, LRU_INACTIVE_ANON))
> +               if (vm_swappiness && (refaults != target_lruvec->refaults[0] ||
> +                       inactive_is_low(target_lruvec, LRU_INACTIVE_ANON)))

If you are really seeing the said behavior then why will this fix it.
This is just about deactivating active anon LRU. I would rather look
at get_scan_count() to check why swappiness = 0 is still letting the
kernel to scan anon LRU. BTW in cgroup v1, the memcg can overwrite
their swappiness which will be preferred over system vm_swappiness.
Did you set system level swappiness or memcg one?

>                         sc->may_deactivate |= DEACTIVATE_ANON;
>                 else
>                         sc->may_deactivate &= ~DEACTIVATE_ANON;
> --
> 2.31.1
>
Nico Pache Aug. 7, 2021, 1:37 a.m. UTC | #2
On 8/6/21 9:00 PM, Shakeel Butt wrote:
> On Fri, Aug 6, 2021 at 4:17 PM Nico Pache <npache@redhat.com> wrote:
>> Since commit b91ac374346b ("mm: vmscan: enforce inactive:active ratio at the
>> reclaim root") swappiness can start prematurely swapping anon memory.
>> This is due to the assumption that refaulting anon should always allow
>> the shrinker to target anon memory. Add a check for vm_swappiness being
>>> 0 before indiscriminately targeting Anon.
> Did you actually observe this behavior?
Yes, and I've successfully tested this patch. It does solve the issue.
>
>> Signed-off-by: Nico Pache <npache@redhat.com>
>> ---
>>  mm/vmscan.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index 4620df62f0ff..8b932ff72e37 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -2909,8 +2909,8 @@ static void shrink_node(pg_data_t *pgdat, struct scan_control *sc)
>>
>>                 refaults = lruvec_page_state(target_lruvec,
>>                                 WORKINGSET_ACTIVATE_ANON);
>> -               if (refaults != target_lruvec->refaults[0] ||
>> -                       inactive_is_low(target_lruvec, LRU_INACTIVE_ANON))
>> +               if (vm_swappiness && (refaults != target_lruvec->refaults[0] ||
>> +                       inactive_is_low(target_lruvec, LRU_INACTIVE_ANON)))
> If you are really seeing the said behavior then why will this fix it.
> This is just about deactivating active anon LRU. I would rather look
> at get_scan_count() to check why swappiness = 0 is still letting the
> kernel to scan anon LRU. BTW in cgroup v1, the memcg can overwrite
> their swappiness which will be preferred over system vm_swappiness.
> Did you set system level swappiness or memcg one?

This fixes the issue because shrink_list() uses the may_deactivate field to determine if it should shrink the active list. This is not the only place that can cause the may_deactivate to deactivate anon, but it is the common path of kswapd/balance_pgdat. I can look into a get_scan_count() solution however this line is the ultimate cause of telling scan controller to go for anon so i figured this is the best spot ( stop the problem at the root, not all the way down in the call path). The get_scan_count balance can also be further modified after some shrinking occurs in shrink_lruvec. 

This is only the system level swappiness. As far as cgroups, I will also take a look into that to make sure we can generalize the solution for that as well. I dont think it should be too hard. 


Thanks for the review!

-- Nico


>>                         sc->may_deactivate |= DEACTIVATE_ANON;
>>                 else
>>                         sc->may_deactivate &= ~DEACTIVATE_ANON;
>> --
>> 2.31.1
>>
Shakeel Butt Aug. 7, 2021, 5:23 a.m. UTC | #3
On Fri, Aug 6, 2021 at 6:37 PM Nico Pache <npache@redhat.com> wrote:
>
>
> On 8/6/21 9:00 PM, Shakeel Butt wrote:
[...]
> > If you are really seeing the said behavior then why will this fix it.
> > This is just about deactivating active anon LRU. I would rather look
> > at get_scan_count() to check why swappiness = 0 is still letting the
> > kernel to scan anon LRU. BTW in cgroup v1, the memcg can overwrite
> > their swappiness which will be preferred over system vm_swappiness.
> > Did you set system level swappiness or memcg one?
>
> This fixes the issue because shrink_list() uses the may_deactivate field
> to determine if it should shrink the active list.

First, the shrink_list() will not be called for anon LRU if get_scan_count()
has decided to not scan the anon LRU.

Second, I would like to get your attention to the following comment in
get_scan_count():

"Global reclaim will swap to prevent OOM even with no swappiness"

It seems like the behavior you are seeing is actually working as intended.
You may decide to change that behavior but you will need to motivate the
change.
Michal Hocko Aug. 9, 2021, 11:49 a.m. UTC | #4
On Fri 06-08-21 22:23:48, Shakeel Butt wrote:
> On Fri, Aug 6, 2021 at 6:37 PM Nico Pache <npache@redhat.com> wrote:
> >
> >
> > On 8/6/21 9:00 PM, Shakeel Butt wrote:
> [...]
> > > If you are really seeing the said behavior then why will this fix it.
> > > This is just about deactivating active anon LRU. I would rather look
> > > at get_scan_count() to check why swappiness = 0 is still letting the
> > > kernel to scan anon LRU. BTW in cgroup v1, the memcg can overwrite
> > > their swappiness which will be preferred over system vm_swappiness.
> > > Did you set system level swappiness or memcg one?
> >
> > This fixes the issue because shrink_list() uses the may_deactivate field
> > to determine if it should shrink the active list.
> 
> First, the shrink_list() will not be called for anon LRU if get_scan_count()
> has decided to not scan the anon LRU.
> 
> Second, I would like to get your attention to the following comment in
> get_scan_count():
> 
> "Global reclaim will swap to prevent OOM even with no swappiness"
> 
> It seems like the behavior you are seeing is actually working as intended.
> You may decide to change that behavior but you will need to motivate the
> change.

Yes this is true. Only the memcg has a strict no swapping behavior
historically. I do agree that the patch should go into much more details
about the existing problem though. In this context it would be really
good to explain why trashing over page cache is a better outcome than
swapping out some pages.
Nico Pache Aug. 9, 2021, 10:31 p.m. UTC | #5
> First, the shrink_list() will not be called for anon LRU if get_scan_count()
> has decided to not scan the anon LRU.

get_scan_count() will decide to scan the anon LRU if(sc->is_file_tiny) which is set in shrink_node().

 In shrink_node() the MAY_DEACTIVATE/DEACTIVATE_ANON allows this the be activated.

> Second, I would like to get your attention to the following comment in
> get_scan_count():
>
> "Global reclaim will swap to prevent OOM even with no swappiness"

AFAIK my patchset doesn't prevent any of the OOM cases. It only prevents the anon workingset refaults

from challenging the anon if swappiness=0. 

> It seems like the behavior you are seeing is actually working as intended.
> You may decide to change that behavior but you will need to motivate the
> change.

My V3 has a lot more in the commit log. Hopefully it will clear up my motivation. I will post that now.
diff mbox series

Patch

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 4620df62f0ff..8b932ff72e37 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2909,8 +2909,8 @@  static void shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 
 		refaults = lruvec_page_state(target_lruvec,
 				WORKINGSET_ACTIVATE_ANON);
-		if (refaults != target_lruvec->refaults[0] ||
-			inactive_is_low(target_lruvec, LRU_INACTIVE_ANON))
+		if (vm_swappiness && (refaults != target_lruvec->refaults[0] ||
+			inactive_is_low(target_lruvec, LRU_INACTIVE_ANON)))
 			sc->may_deactivate |= DEACTIVATE_ANON;
 		else
 			sc->may_deactivate &= ~DEACTIVATE_ANON;