diff mbox series

[v2] mm: add lazyfree folio to lru tail

Message ID f29f64e29c08427b95e3df30a5770056@honor.com (mailing list archive)
State New
Headers show
Series [v2] mm: add lazyfree folio to lru tail | expand

Commit Message

gaoxu Aug. 16, 2024, 7:48 a.m. UTC
Replace lruvec_add_folio with lruvec_add_folio_tail in the lru_lazyfree_fn:
1. The lazy-free folio is added to the LRU_INACTIVE_FILE list. If it's
   moved to the LRU tail, it allows for faster release lazy-free folio and
   reduces the impact on file refault.
2. When mglru is enabled, the lazy-free folio is reclaimabled and should be
   added using lru_gen_add_folio(lruvec, folio, true) instead of
   lru_gen_add_folio(lruvec, folio, false) for adding to gen.

With the change in place, workingset_refault_file is reduced by 33% in the
continuous startup testing of the applications in the Android system.

Signed-off-by: gao xu <gaoxu2@hihonor.com>
---
V1 -> V2: Based on the latest mm-unstable, recreate the patch.

 mm/swap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

David Hildenbrand Aug. 16, 2024, 10:19 a.m. UTC | #1
On 16.08.24 09:48, gaoxu wrote:
> Replace lruvec_add_folio with lruvec_add_folio_tail in the lru_lazyfree_fn:
> 1. The lazy-free folio is added to the LRU_INACTIVE_FILE list. If it's
>     moved to the LRU tail, it allows for faster release lazy-free folio and
>     reduces the impact on file refault.
> 2. When mglru is enabled, the lazy-free folio is reclaimabled and should be
>     added using lru_gen_add_folio(lruvec, folio, true) instead of
>     lru_gen_add_folio(lruvec, folio, false) for adding to gen.
> 
> With the change in place, workingset_refault_file is reduced by 33% in the
> continuous startup testing of the applications in the Android system.
> 

The patch subject does not match what you do in the patch -- at all.

"mm/swap: use lruvec_add_folio_tail() in lru_lazyfree()" ?
David Hildenbrand Aug. 16, 2024, 10:21 a.m. UTC | #2
On 16.08.24 12:19, David Hildenbrand wrote:
> On 16.08.24 09:48, gaoxu wrote:
>> Replace lruvec_add_folio with lruvec_add_folio_tail in the lru_lazyfree_fn:
>> 1. The lazy-free folio is added to the LRU_INACTIVE_FILE list. If it's
>>      moved to the LRU tail, it allows for faster release lazy-free folio and
>>      reduces the impact on file refault.
>> 2. When mglru is enabled, the lazy-free folio is reclaimabled and should be
>>      added using lru_gen_add_folio(lruvec, folio, true) instead of
>>      lru_gen_add_folio(lruvec, folio, false) for adding to gen.
>>
>> With the change in place, workingset_refault_file is reduced by 33% in the
>> continuous startup testing of the applications in the Android system.
>>
> 
> The patch subject does not match what you do in the patch -- at all.
> 
> "mm/swap: use lruvec_add_folio_tail() in lru_lazyfree()" ?

Ah, sorry, I read it too fast. Yours does make sense :)

(mm/swap might make sense, though)
Suren Baghdasaryan Aug. 19, 2024, 4:11 p.m. UTC | #3
On Fri, Aug 16, 2024 at 3:21 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 16.08.24 12:19, David Hildenbrand wrote:
> > On 16.08.24 09:48, gaoxu wrote:
> >> Replace lruvec_add_folio with lruvec_add_folio_tail in the lru_lazyfree_fn:
> >> 1. The lazy-free folio is added to the LRU_INACTIVE_FILE list. If it's
> >>      moved to the LRU tail, it allows for faster release lazy-free folio and
> >>      reduces the impact on file refault.
> >> 2. When mglru is enabled, the lazy-free folio is reclaimabled and should be
> >>      added using lru_gen_add_folio(lruvec, folio, true) instead of
> >>      lru_gen_add_folio(lruvec, folio, false) for adding to gen.
> >>
> >> With the change in place, workingset_refault_file is reduced by 33% in the
> >> continuous startup testing of the applications in the Android system.

Was this improvement recorded with MGLRU enabled or disabled?
CC'ing Yu Zhao as well.

> >>
> >
> > The patch subject does not match what you do in the patch -- at all.
> >
> > "mm/swap: use lruvec_add_folio_tail() in lru_lazyfree()" ?
>
> Ah, sorry, I read it too fast. Yours does make sense :)
>
> (mm/swap might make sense, though)
>
> --
> Cheers,
>
> David / dhildenb
>
Barry Song Aug. 20, 2024, 10:11 a.m. UTC | #4
On Fri, Aug 16, 2024 at 7:48 PM gaoxu <gaoxu2@honor.com> wrote:
>
> Replace lruvec_add_folio with lruvec_add_folio_tail in the lru_lazyfree_fn:
> 1. The lazy-free folio is added to the LRU_INACTIVE_FILE list. If it's
>    moved to the LRU tail, it allows for faster release lazy-free folio and
>    reduces the impact on file refault.
> 2. When mglru is enabled, the lazy-free folio is reclaimabled and should be
>    added using lru_gen_add_folio(lruvec, folio, true) instead of
>    lru_gen_add_folio(lruvec, folio, false) for adding to gen.
>
> With the change in place, workingset_refault_file is reduced by 33% in the
> continuous startup testing of the applications in the Android system.
>

Hi Gao,

Just curious, in which scenario are we frequently calling
MADV_FREE but not MADV_DONTNEED?

> Signed-off-by: gao xu <gaoxu2@hihonor.com>
> ---
> V1 -> V2: Based on the latest mm-unstable, recreate the patch.
>
>  mm/swap.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/swap.c b/mm/swap.c
> index 6b838986d..e0dbfc983 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -641,7 +641,7 @@ static void lru_lazyfree(struct lruvec *lruvec, struct folio *folio)
>          * anonymous folios
>          */
>         folio_clear_swapbacked(folio);
> -       lruvec_add_folio(lruvec, folio);
> +       lruvec_add_folio_tail(lruvec, folio);
>
>         __count_vm_events(PGLAZYFREE, nr_pages);
>         __count_memcg_events(lruvec_memcg(lruvec), PGLAZYFREE, nr_pages);
> --
> 2.17.1
gaoxu Aug. 20, 2024, 11:53 a.m. UTC | #5
> 
> On Fri, Aug 16, 2024 at 7:48 PM gaoxu <gaoxu2@honor.com> wrote:
> >
> > Replace lruvec_add_folio with lruvec_add_folio_tail in the lru_lazyfree_fn:
> > 1. The lazy-free folio is added to the LRU_INACTIVE_FILE list. If it's
> >    moved to the LRU tail, it allows for faster release lazy-free folio and
> >    reduces the impact on file refault.
> > 2. When mglru is enabled, the lazy-free folio is reclaimabled and should be
> >    added using lru_gen_add_folio(lruvec, folio, true) instead of
> >    lru_gen_add_folio(lruvec, folio, false) for adding to gen.
> >
> > With the change in place, workingset_refault_file is reduced by 33% in
> > the continuous startup testing of the applications in the Android system.
> >
> 
> Hi Gao,
> 
> Just curious, in which scenario are we frequently calling MADV_FREE but not
> MADV_DONTNEED?
Hi Song,
 Android ART use MADV_FREE, please refer to the following link.
 https://android-review.googlesource.com/c/platform/art/+/2633132
> 
> > Signed-off-by: gao xu <gaoxu2@hihonor.com>
> > ---
> > V1 -> V2: Based on the latest mm-unstable, recreate the patch.
> >
> >  mm/swap.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/swap.c b/mm/swap.c
> > index 6b838986d..e0dbfc983 100644
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -641,7 +641,7 @@ static void lru_lazyfree(struct lruvec *lruvec, struct
> folio *folio)
> >          * anonymous folios
> >          */
> >         folio_clear_swapbacked(folio);
> > -       lruvec_add_folio(lruvec, folio);
> > +       lruvec_add_folio_tail(lruvec, folio);
> >
> >         __count_vm_events(PGLAZYFREE, nr_pages);
> >         __count_memcg_events(lruvec_memcg(lruvec), PGLAZYFREE,
> > nr_pages);
> > --
> > 2.17.1
Barry Song Aug. 20, 2024, 9:46 p.m. UTC | #6
On Tue, Aug 20, 2024 at 11:54 PM gaoxu <gaoxu2@honor.com> wrote:
>
> >
> > On Fri, Aug 16, 2024 at 7:48 PM gaoxu <gaoxu2@honor.com> wrote:
> > >
> > > Replace lruvec_add_folio with lruvec_add_folio_tail in the lru_lazyfree_fn:
> > > 1. The lazy-free folio is added to the LRU_INACTIVE_FILE list. If it's
> > >    moved to the LRU tail, it allows for faster release lazy-free folio and
> > >    reduces the impact on file refault.
> > > 2. When mglru is enabled, the lazy-free folio is reclaimabled and should be
> > >    added using lru_gen_add_folio(lruvec, folio, true) instead of
> > >    lru_gen_add_folio(lruvec, folio, false) for adding to gen.
> > >
> > > With the change in place, workingset_refault_file is reduced by 33% in
> > > the continuous startup testing of the applications in the Android system.
> > >
> >
> > Hi Gao,
> >
> > Just curious, in which scenario are we frequently calling MADV_FREE but not
> > MADV_DONTNEED?
> Hi Song,
>  Android ART use MADV_FREE, please refer to the following link.
>  https://android-review.googlesource.com/c/platform/art/+/2633132

thanks! It seems like the art guys owe us some changelogs, with
no data and no reason :-)

Your change seems reasonable to me. Since users plan to free those
anon folios, we should avoid placing them in hard-to-reclaim areas,
which could lead to unnecessarily reclaiming file folios instead.

> >
> > > Signed-off-by: gao xu <gaoxu2@hihonor.com>
> > > ---
> > > V1 -> V2: Based on the latest mm-unstable, recreate the patch.
> > >
> > >  mm/swap.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/mm/swap.c b/mm/swap.c
> > > index 6b838986d..e0dbfc983 100644
> > > --- a/mm/swap.c
> > > +++ b/mm/swap.c
> > > @@ -641,7 +641,7 @@ static void lru_lazyfree(struct lruvec *lruvec, struct
> > folio *folio)
> > >          * anonymous folios
> > >          */
> > >         folio_clear_swapbacked(folio);
> > > -       lruvec_add_folio(lruvec, folio);
> > > +       lruvec_add_folio_tail(lruvec, folio);
> > >
> > >         __count_vm_events(PGLAZYFREE, nr_pages);
> > >         __count_memcg_events(lruvec_memcg(lruvec), PGLAZYFREE,
> > > nr_pages);
> > > --
> > > 2.17.1

Thanks
Barry
Michal Hocko Aug. 21, 2024, 12:46 p.m. UTC | #7
On Fri 16-08-24 07:48:01, gaoxu wrote:
> Replace lruvec_add_folio with lruvec_add_folio_tail in the lru_lazyfree_fn:
> 1. The lazy-free folio is added to the LRU_INACTIVE_FILE list. If it's
>    moved to the LRU tail, it allows for faster release lazy-free folio and
>    reduces the impact on file refault.

This has been discussed when MADV_FREE was introduced. The question was
whether this memory has a lower priority than other inactive memory that
has been marked that way longer ago. Also consider several MADV_FREE
users should they be LIFO from the reclaim POV?
Barry Song Aug. 21, 2024, 9:47 p.m. UTC | #8
On Wed, Aug 21, 2024 at 8:46 PM Michal Hocko <mhocko@suse.com> wrote:
>
> On Fri 16-08-24 07:48:01, gaoxu wrote:
> > Replace lruvec_add_folio with lruvec_add_folio_tail in the lru_lazyfree_fn:
> > 1. The lazy-free folio is added to the LRU_INACTIVE_FILE list. If it's
> >    moved to the LRU tail, it allows for faster release lazy-free folio and
> >    reduces the impact on file refault.
>
> This has been discussed when MADV_FREE was introduced. The question was
> whether this memory has a lower priority than other inactive memory that
> has been marked that way longer ago. Also consider several MADV_FREE
> users should they be LIFO from the reclaim POV?

The priority of this memory compared to other inactive memory that has been
marked for a longer time likely depends on the user's expectations - How soon
do users expect MADV_FREE to be reclaimed compared with old file folios.

art guys moved to MADV_FREE from MADV_DONTNEED without any
useful performance data and reason in the changelog:
https://android-review.googlesource.com/c/platform/art/+/2633132

Since art is the Android Java heap, it can be quite large. This increases the
likelihood of packing the file LRU and reduces the chances of reclaiming
anonymous memory, which could result in more file re-faults while helping
anonymous folio persist longer in memory.

I am really curious why art guys have moved to MADV_FREE if we have
an approach to reach them.

>
> --
> Michal Hocko
> SUSE Labs
>

Thanks
Barry
Suren Baghdasaryan Aug. 23, 2024, 11:39 p.m. UTC | #9
On Wed, Aug 21, 2024 at 2:47 PM Barry Song <21cnbao@gmail.com> wrote:
>
> On Wed, Aug 21, 2024 at 8:46 PM Michal Hocko <mhocko@suse.com> wrote:
> >
> > On Fri 16-08-24 07:48:01, gaoxu wrote:
> > > Replace lruvec_add_folio with lruvec_add_folio_tail in the lru_lazyfree_fn:
> > > 1. The lazy-free folio is added to the LRU_INACTIVE_FILE list. If it's
> > >    moved to the LRU tail, it allows for faster release lazy-free folio and
> > >    reduces the impact on file refault.
> >
> > This has been discussed when MADV_FREE was introduced. The question was
> > whether this memory has a lower priority than other inactive memory that
> > has been marked that way longer ago. Also consider several MADV_FREE
> > users should they be LIFO from the reclaim POV?
>
> The priority of this memory compared to other inactive memory that has been
> marked for a longer time likely depends on the user's expectations - How soon
> do users expect MADV_FREE to be reclaimed compared with old file folios.
>
> art guys moved to MADV_FREE from MADV_DONTNEED without any
> useful performance data and reason in the changelog:
> https://android-review.googlesource.com/c/platform/art/+/2633132
>
> Since art is the Android Java heap, it can be quite large. This increases the
> likelihood of packing the file LRU and reduces the chances of reclaiming
> anonymous memory, which could result in more file re-faults while helping
> anonymous folio persist longer in memory.
>
> I am really curious why art guys have moved to MADV_FREE if we have
> an approach to reach them.

Adding Lokesh.
Lokesh, could you please comment on the reasoning behind the above
mentioned change?

>
> >
> > --
> > Michal Hocko
> > SUSE Labs
> >
>
> Thanks
> Barry
Lokesh Gidra Aug. 26, 2024, 4:37 p.m. UTC | #10
Thanks Suren for looping in

On Fri, Aug 23, 2024 at 4:39 PM Suren Baghdasaryan <surenb@google.com> wrote:
>
> On Wed, Aug 21, 2024 at 2:47 PM Barry Song <21cnbao@gmail.com> wrote:
> >
> > On Wed, Aug 21, 2024 at 8:46 PM Michal Hocko <mhocko@suse.com> wrote:
> > >
> > > On Fri 16-08-24 07:48:01, gaoxu wrote:
> > > > Replace lruvec_add_folio with lruvec_add_folio_tail in the lru_lazyfree_fn:
> > > > 1. The lazy-free folio is added to the LRU_INACTIVE_FILE list. If it's
> > > >    moved to the LRU tail, it allows for faster release lazy-free folio and
> > > >    reduces the impact on file refault.
> > >
> > > This has been discussed when MADV_FREE was introduced. The question was
> > > whether this memory has a lower priority than other inactive memory that
> > > has been marked that way longer ago. Also consider several MADV_FREE
> > > users should they be LIFO from the reclaim POV?

Thinking from the user's perspective, it seems to me that FIFO within
MADV_FREE'ed pages makes more sense. As a user I expect the longer a
MADV_FREE'ed page hasn't been touched, the chances are higher that it
may not be around anymore.
> >
> > The priority of this memory compared to other inactive memory that has been
> > marked for a longer time likely depends on the user's expectations - How soon
> > do users expect MADV_FREE to be reclaimed compared with old file folios.
> >
> > art guys moved to MADV_FREE from MADV_DONTNEED without any
> > useful performance data and reason in the changelog:
> > https://android-review.googlesource.com/c/platform/art/+/2633132
> >
> > Since art is the Android Java heap, it can be quite large. This increases the
> > likelihood of packing the file LRU and reduces the chances of reclaiming
> > anonymous memory, which could result in more file re-faults while helping
> > anonymous folio persist longer in memory.

Individual heaps of android apps are not big, and even in there we
don't call MADV_FREE on the entire heap.
> >
> > I am really curious why art guys have moved to MADV_FREE if we have
> > an approach to reach them.

Honestly, it makes little sense as a user that calling MADV_FREE on an
anonymous mapping will impact file LRU. That was never the intention
with our ART change.

From our perspective, once a set of pages are MADV_FREE'ed, they are
like a page-cache. It gives an opportunity, without hurting memory
use, to avoid overhead of page-faults, which happen frequently after
GC is done on running apps.

IMHO, within LRU_INACTIVE_FILE, MADV_FREE'ed pages should be
prioritized for reclamation over file ones.
>
> Adding Lokesh.
> Lokesh, could you please comment on the reasoning behind the above
> mentioned change?

Adding Nicolas as well, in case he wants to add something.
>
> >
> > >
> > > --
> > > Michal Hocko
> > > SUSE Labs
> > >
> >
> > Thanks
> > Barry
Barry Song Aug. 26, 2024, 7:54 p.m. UTC | #11
On Tue, Aug 27, 2024 at 4:37 AM Lokesh Gidra <lokeshgidra@google.com> wrote:
>
> Thanks Suren for looping in
>
> On Fri, Aug 23, 2024 at 4:39 PM Suren Baghdasaryan <surenb@google.com> wrote:
> >
> > On Wed, Aug 21, 2024 at 2:47 PM Barry Song <21cnbao@gmail.com> wrote:
> > >
> > > On Wed, Aug 21, 2024 at 8:46 PM Michal Hocko <mhocko@suse.com> wrote:
> > > >
> > > > On Fri 16-08-24 07:48:01, gaoxu wrote:
> > > > > Replace lruvec_add_folio with lruvec_add_folio_tail in the lru_lazyfree_fn:
> > > > > 1. The lazy-free folio is added to the LRU_INACTIVE_FILE list. If it's
> > > > >    moved to the LRU tail, it allows for faster release lazy-free folio and
> > > > >    reduces the impact on file refault.
> > > >
> > > > This has been discussed when MADV_FREE was introduced. The question was
> > > > whether this memory has a lower priority than other inactive memory that
> > > > has been marked that way longer ago. Also consider several MADV_FREE
> > > > users should they be LIFO from the reclaim POV?
>
> Thinking from the user's perspective, it seems to me that FIFO within
> MADV_FREE'ed pages makes more sense. As a user I expect the longer a
> MADV_FREE'ed page hasn't been touched, the chances are higher that it
> may not be around anymore.
> > >

Hi Lokesh,
Thanks!

> > > The priority of this memory compared to other inactive memory that has been
> > > marked for a longer time likely depends on the user's expectations - How soon
> > > do users expect MADV_FREE to be reclaimed compared with old file folios.
> > >
> > > art guys moved to MADV_FREE from MADV_DONTNEED without any
> > > useful performance data and reason in the changelog:
> > > https://android-review.googlesource.com/c/platform/art/+/2633132
> > >
> > > Since art is the Android Java heap, it can be quite large. This increases the
> > > likelihood of packing the file LRU and reduces the chances of reclaiming
> > > anonymous memory, which could result in more file re-faults while helping
> > > anonymous folio persist longer in memory.
>
> Individual heaps of android apps are not big, and even in there we
> don't call MADV_FREE on the entire heap.

How do you define "Individual heaps of android apps", do you know the usual
total_size for a phone with memory pressure by running multiple apps and how
much for each app?

> > >
> > > I am really curious why art guys have moved to MADV_FREE if we have
> > > an approach to reach them.
>
> Honestly, it makes little sense as a user that calling MADV_FREE on an
> anonymous mapping will impact file LRU. That was never the intention
> with our ART change.
>

This is just how MADV_FREE is implemented in the kernel, this kind of lazyfree
anon folios are moved to file but *NOT* anon LRU.

> From our perspective, once a set of pages are MADV_FREE'ed, they are
> like a page-cache. It gives an opportunity, without hurting memory
> use, to avoid overhead of page-faults, which happen frequently after
> GC is done on running apps.
>
> IMHO, within LRU_INACTIVE_FILE, MADV_FREE'ed pages should be
> prioritized for reclamation over file ones.

This is exactly what this patch is doing, putting lazyfree anon folios
to the tail of file LRU so that they can be reclaimed earlier than file
folios. But the question is: is the requirement "MADV_FREE'ed pages
should be prioritized for reclamation over file ones" universally true for
all other non-Android users?

> >
> > Adding Lokesh.
> > Lokesh, could you please comment on the reasoning behind the above
> > mentioned change?
>
> Adding Nicolas as well, in case he wants to add something.
> >
> > >
> > > >
> > > > --
> > > > Michal Hocko
> > > > SUSE Labs
> > > >
> > >

Thanks
Barry
Lokesh Gidra Aug. 27, 2024, 12:12 a.m. UTC | #12
On Mon, Aug 26, 2024 at 12:55 PM Barry Song <21cnbao@gmail.com> wrote:
>
> On Tue, Aug 27, 2024 at 4:37 AM Lokesh Gidra <lokeshgidra@google.com> wrote:
> >
> > Thanks Suren for looping in
> >
> > On Fri, Aug 23, 2024 at 4:39 PM Suren Baghdasaryan <surenb@google.com> wrote:
> > >
> > > On Wed, Aug 21, 2024 at 2:47 PM Barry Song <21cnbao@gmail.com> wrote:
> > > >
> > > > On Wed, Aug 21, 2024 at 8:46 PM Michal Hocko <mhocko@suse.com> wrote:
> > > > >
> > > > > On Fri 16-08-24 07:48:01, gaoxu wrote:
> > > > > > Replace lruvec_add_folio with lruvec_add_folio_tail in the lru_lazyfree_fn:
> > > > > > 1. The lazy-free folio is added to the LRU_INACTIVE_FILE list. If it's
> > > > > >    moved to the LRU tail, it allows for faster release lazy-free folio and
> > > > > >    reduces the impact on file refault.
> > > > >
> > > > > This has been discussed when MADV_FREE was introduced. The question was
> > > > > whether this memory has a lower priority than other inactive memory that
> > > > > has been marked that way longer ago. Also consider several MADV_FREE
> > > > > users should they be LIFO from the reclaim POV?
> >
> > Thinking from the user's perspective, it seems to me that FIFO within
> > MADV_FREE'ed pages makes more sense. As a user I expect the longer a
> > MADV_FREE'ed page hasn't been touched, the chances are higher that it
> > may not be around anymore.
> > > >
>
> Hi Lokesh,
> Thanks!
>
> > > > The priority of this memory compared to other inactive memory that has been
> > > > marked for a longer time likely depends on the user's expectations - How soon
> > > > do users expect MADV_FREE to be reclaimed compared with old file folios.
> > > >
> > > > art guys moved to MADV_FREE from MADV_DONTNEED without any
> > > > useful performance data and reason in the changelog:
> > > > https://android-review.googlesource.com/c/platform/art/+/2633132
> > > >
> > > > Since art is the Android Java heap, it can be quite large. This increases the
> > > > likelihood of packing the file LRU and reduces the chances of reclaiming
> > > > anonymous memory, which could result in more file re-faults while helping
> > > > anonymous folio persist longer in memory.
> >
> > Individual heaps of android apps are not big, and even in there we
> > don't call MADV_FREE on the entire heap.
>
> How do you define "Individual heaps of android apps", do you know the usual
> total_size for a phone with memory pressure by running multiple apps and how
> much for each app?
>
Every app is a separate process and therefore has its own private ART
heap. Those numbers that you are asking vary drastically. But here's
what I can tell you:

Max heap size for an app is 512MB typically. But it is rarely entirely
used. Typical heap usage is 50MB to 250MB. But as I said, not all of
it is MADV_FREE'ed. Only those pages which are freed after GC
compaction are.
> > > >
> > > > I am really curious why art guys have moved to MADV_FREE if we have
> > > > an approach to reach them.
> >
> > Honestly, it makes little sense as a user that calling MADV_FREE on an
> > anonymous mapping will impact file LRU. That was never the intention
> > with our ART change.
> >
>
> This is just how MADV_FREE is implemented in the kernel, this kind of lazyfree
> anon folios are moved to file but *NOT* anon LRU.
>
> > From our perspective, once a set of pages are MADV_FREE'ed, they are
> > like a page-cache. It gives an opportunity, without hurting memory
> > use, to avoid overhead of page-faults, which happen frequently after
> > GC is done on running apps.
> >
> > IMHO, within LRU_INACTIVE_FILE, MADV_FREE'ed pages should be
> > prioritized for reclamation over file ones.
>
> This is exactly what this patch is doing, putting lazyfree anon folios
> to the tail of file LRU so that they can be reclaimed earlier than file
> folios. But the question is: is the requirement "MADV_FREE'ed pages
> should be prioritized for reclamation over file ones" universally true for
> all other non-Android users?
>
That's definitely an important question to get answered. But putting
my users hat on again, by explicitly MADV_FREE'ing we ask for that
behavior. IMHO, MADV_FREE'ed pages should be the first ones to be
reclaimed on memory pressure.
> > >
> > > Adding Lokesh.
> > > Lokesh, could you please comment on the reasoning behind the above
> > > mentioned change?
> >
> > Adding Nicolas as well, in case he wants to add something.
> > >
> > > >
> > > > >
> > > > > --
> > > > > Michal Hocko
> > > > > SUSE Labs
> > > > >
> > > >
>
> Thanks
> Barry
Hailong Liu Aug. 27, 2024, 2:13 a.m. UTC | #13
On Mon, 26. Aug 09:37, Lokesh Gidra wrote:
>
> IMHO, within LRU_INACTIVE_FILE, MADV_FREE'ed pages should be
> prioritized for reclamation over file ones.


> >
> > Adding Lokesh.
> > Lokesh, could you please comment on the reasoning behind the above
> > mentioned change?
>
> Adding Nicolas as well, in case he wants to add something.
IMHO, lruvec_add_folio is enough. if lruvec_add_folio_tail why not use
MADV_DONTNEED instead? In MM the reclaim policy prefer to reclaim file cache, if
MADV_FREE'd pages directly add to the tail, they might be reclaimed instantly.
Also the benefit of workingset_refault_file cannot be convinced for me.

So we should know the reasons and the benefits of the changes. page faults or ?
> >
> > >
> > > >
> > > > --
> > > > Michal Hocko
> > > > SUSE Labs
> > > >
> > >
> > > Thanks
> > > Barry
>

--

Help you, Help me,
Hailong.
Barry Song Aug. 27, 2024, 2:18 a.m. UTC | #14
On Tue, Aug 27, 2024 at 2:13 PM Hailong Liu <hailong.liu@oppo.com> wrote:
>
> On Mon, 26. Aug 09:37, Lokesh Gidra wrote:
> >
> > IMHO, within LRU_INACTIVE_FILE, MADV_FREE'ed pages should be
> > prioritized for reclamation over file ones.
>
>
> > >
> > > Adding Lokesh.
> > > Lokesh, could you please comment on the reasoning behind the above
> > > mentioned change?
> >
> > Adding Nicolas as well, in case he wants to add something.
> IMHO, lruvec_add_folio is enough. if lruvec_add_folio_tail why not use
> MADV_DONTNEED instead? In MM the reclaim policy prefer to reclaim file cache, if
> MADV_FREE'd pages directly add to the tail, they might be reclaimed instantly.
> Also the benefit of workingset_refault_file cannot be convinced for me.

My understanding is that MADV_DONTNEED will immediately free the memory,
whereas MADV_FREE will release memory only under memory pressure. If
memory pressure is low, the anonymous memory may still be gotten back
without causing page faults. This might be what Lokesh is aiming to achieve.

>
> So we should know the reasons and the benefits of the changes. page faults or ?
> > >
> > > >
> > > > >
> > > > > --
> > > > > Michal Hocko
> > > > > SUSE Labs
> > > > >
> > > >

Thanks
Barry
Barry Song Aug. 27, 2024, 2:21 a.m. UTC | #15
On Tue, Aug 27, 2024 at 12:12 PM Lokesh Gidra <lokeshgidra@google.com> wrote:
>
> On Mon, Aug 26, 2024 at 12:55 PM Barry Song <21cnbao@gmail.com> wrote:
> >
> > On Tue, Aug 27, 2024 at 4:37 AM Lokesh Gidra <lokeshgidra@google.com> wrote:
> > >
> > > Thanks Suren for looping in
> > >
> > > On Fri, Aug 23, 2024 at 4:39 PM Suren Baghdasaryan <surenb@google.com> wrote:
> > > >
> > > > On Wed, Aug 21, 2024 at 2:47 PM Barry Song <21cnbao@gmail.com> wrote:
> > > > >
> > > > > On Wed, Aug 21, 2024 at 8:46 PM Michal Hocko <mhocko@suse.com> wrote:
> > > > > >
> > > > > > On Fri 16-08-24 07:48:01, gaoxu wrote:
> > > > > > > Replace lruvec_add_folio with lruvec_add_folio_tail in the lru_lazyfree_fn:
> > > > > > > 1. The lazy-free folio is added to the LRU_INACTIVE_FILE list. If it's
> > > > > > >    moved to the LRU tail, it allows for faster release lazy-free folio and
> > > > > > >    reduces the impact on file refault.
> > > > > >
> > > > > > This has been discussed when MADV_FREE was introduced. The question was
> > > > > > whether this memory has a lower priority than other inactive memory that
> > > > > > has been marked that way longer ago. Also consider several MADV_FREE
> > > > > > users should they be LIFO from the reclaim POV?
> > >
> > > Thinking from the user's perspective, it seems to me that FIFO within
> > > MADV_FREE'ed pages makes more sense. As a user I expect the longer a
> > > MADV_FREE'ed page hasn't been touched, the chances are higher that it
> > > may not be around anymore.
> > > > >
> >
> > Hi Lokesh,
> > Thanks!
> >
> > > > > The priority of this memory compared to other inactive memory that has been
> > > > > marked for a longer time likely depends on the user's expectations - How soon
> > > > > do users expect MADV_FREE to be reclaimed compared with old file folios.
> > > > >
> > > > > art guys moved to MADV_FREE from MADV_DONTNEED without any
> > > > > useful performance data and reason in the changelog:
> > > > > https://android-review.googlesource.com/c/platform/art/+/2633132
> > > > >
> > > > > Since art is the Android Java heap, it can be quite large. This increases the
> > > > > likelihood of packing the file LRU and reduces the chances of reclaiming
> > > > > anonymous memory, which could result in more file re-faults while helping
> > > > > anonymous folio persist longer in memory.
> > >
> > > Individual heaps of android apps are not big, and even in there we
> > > don't call MADV_FREE on the entire heap.
> >
> > How do you define "Individual heaps of android apps", do you know the usual
> > total_size for a phone with memory pressure by running multiple apps and how
> > much for each app?
> >
> Every app is a separate process and therefore has its own private ART
> heap. Those numbers that you are asking vary drastically. But here's
> what I can tell you:
>
> Max heap size for an app is 512MB typically. But it is rarely entirely

On my phone, I am seeing a VMA named "dalvik-main space", its
virtual address is 0x14000000-0x34000000 and its size is 0x20000000
(512MB). I guess this is exactly the ART heap we are talking about?

> used. Typical heap usage is 50MB to 250MB. But as I said, not all of

Thank you! Considering we might have dozens of apps running in the
background and foreground, will the total size of the ART heaps on a
phone still be large?

> it is MADV_FREE'ed. Only those pages which are freed after GC
> compaction are.

Are you saying that some memory in the ART heap is marked with
MADV_DONTNEED instead of MADV_FREE? If so, when did this happen?
Alternatively, is my understanding incorrect and you are referring to memory
that is neither MADV_DONTNEED nor MADV_FREE?

> > > > >
> > > > > I am really curious why art guys have moved to MADV_FREE if we have
> > > > > an approach to reach them.
> > >
> > > Honestly, it makes little sense as a user that calling MADV_FREE on an
> > > anonymous mapping will impact file LRU. That was never the intention
> > > with our ART change.
> > >
> >
> > This is just how MADV_FREE is implemented in the kernel, this kind of lazyfree
> > anon folios are moved to file but *NOT* anon LRU.
> >
> > > From our perspective, once a set of pages are MADV_FREE'ed, they are
> > > like a page-cache. It gives an opportunity, without hurting memory
> > > use, to avoid overhead of page-faults, which happen frequently after
> > > GC is done on running apps.
> > >
> > > IMHO, within LRU_INACTIVE_FILE, MADV_FREE'ed pages should be
> > > prioritized for reclamation over file ones.
> >
> > This is exactly what this patch is doing, putting lazyfree anon folios
> > to the tail of file LRU so that they can be reclaimed earlier than file
> > folios. But the question is: is the requirement "MADV_FREE'ed pages
> > should be prioritized for reclamation over file ones" universally true for
> > all other non-Android users?
> >
> That's definitely an important question to get answered. But putting
> my users hat on again, by explicitly MADV_FREE'ing we ask for that
> behavior. IMHO, MADV_FREE'ed pages should be the first ones to be
> reclaimed on memory pressure.

Thanks for clarification! I'd also like to collect some performance data with
this patch.

> > > >
> > > > Adding Lokesh.
> > > > Lokesh, could you please comment on the reasoning behind the above
> > > > mentioned change?
> > >
> > > Adding Nicolas as well, in case he wants to add something.
> > > >
> > > > >
> > > > > >
> > > > > > --
> > > > > > Michal Hocko
> > > > > > SUSE Labs
> > > > > >
> > > > >
> >

Thanks
Barry
Hailong Liu Aug. 27, 2024, 2:29 a.m. UTC | #16
On Tue, 27. Aug 14:18, Barry Song wrote:
> On Tue, Aug 27, 2024 at 2:13 PM Hailong Liu <hailong.liu@oppo.com> wrote:
> >
> > On Mon, 26. Aug 09:37, Lokesh Gidra wrote:
> > >
> > > IMHO, within LRU_INACTIVE_FILE, MADV_FREE'ed pages should be
> > > prioritized for reclamation over file ones.
> >
> >
> > > >
> > > > Adding Lokesh.
> > > > Lokesh, could you please comment on the reasoning behind the above
> > > > mentioned change?
> > >
> > > Adding Nicolas as well, in case he wants to add something.
> > IMHO, lruvec_add_folio is enough. if lruvec_add_folio_tail why not use
> > MADV_DONTNEED instead? In MM the reclaim policy prefer to reclaim file cache, if
> > MADV_FREE'd pages directly add to the tail, they might be reclaimed instantly.
> > Also the benefit of workingset_refault_file cannot be convinced for me.
>
> My understanding is that MADV_DONTNEED will immediately free the memory,
> whereas MADV_FREE will release memory only under memory pressure. If
> memory pressure is low, the anonymous memory may still be gotten back
> without causing page faults. This might be what Lokesh is aiming to achieve.
>
Hmm, IIUC, for the reason of watermark, the kswapd would do reclamation without
memory pressure. I worried here is that the madv_free'd pages reclaimed too fast
if add to tail.
> >
> > So we should know the reasons and the benefits of the changes. page faults or ?
> > > >
> > > > >
> > > > > >
> > > > > > --
> > > > > > Michal Hocko
> > > > > > SUSE Labs
> > > > > >
> > > > >
>
> Thanks
> Barry

--

Help you, Help me,
Hailong.
gaoxu Aug. 27, 2024, 4:07 a.m. UTC | #17
> -----邮件原件-----
> 发件人: Lokesh Gidra <lokeshgidra@google.com>
> 发送时间: 2024年8月27日 8:12
> 收件人: Barry Song <21cnbao@gmail.com>
> 抄送: Suren Baghdasaryan <surenb@google.com>; Nicolas Geoffray
> <ngeoffray@google.com>; Michal Hocko <mhocko@suse.com>; gaoxu
> <gaoxu2@honor.com>; Andrew Morton <akpm@linux-foundation.org>;
> linux-mm@kvack.org; linux-kernel@vger.kernel.org; Shaohua Li <shli@fb.com>;
> yipengxiang <yipengxiang@honor.com>; fengbaopeng
> <fengbaopeng@honor.com>; Kalesh Singh <kaleshsingh@google.com>
> 主题: Re: [PATCH v2] mm: add lazyfree folio to lru tail
> 
> On Mon, Aug 26, 2024 at 12:55 PM Barry Song <21cnbao@gmail.com> wrote:
> >
> > On Tue, Aug 27, 2024 at 4:37 AM Lokesh Gidra <lokeshgidra@google.com>
> wrote:
> > >
> > > Thanks Suren for looping in
> > >
> > > On Fri, Aug 23, 2024 at 4:39 PM Suren Baghdasaryan <surenb@google.com>
> wrote:
> > > >
> > > > On Wed, Aug 21, 2024 at 2:47 PM Barry Song <21cnbao@gmail.com>
> wrote:
> > > > >
> > > > > On Wed, Aug 21, 2024 at 8:46 PM Michal Hocko <mhocko@suse.com>
> wrote:
> > > > > >
> > > > > > On Fri 16-08-24 07:48:01, gaoxu wrote:
> > > > > > > Replace lruvec_add_folio with lruvec_add_folio_tail in the
> lru_lazyfree_fn:
> > > > > > > 1. The lazy-free folio is added to the LRU_INACTIVE_FILE list. If it's
> > > > > > >    moved to the LRU tail, it allows for faster release lazy-free folio
> and
> > > > > > >    reduces the impact on file refault.
> > > > > >
> > > > > > This has been discussed when MADV_FREE was introduced. The
> question was
> > > > > > whether this memory has a lower priority than other inactive memory
> that
> > > > > > has been marked that way longer ago. Also consider several
> MADV_FREE
> > > > > > users should they be LIFO from the reclaim POV?
> > >
> > > Thinking from the user's perspective, it seems to me that FIFO within
> > > MADV_FREE'ed pages makes more sense. As a user I expect the longer a
> > > MADV_FREE'ed page hasn't been touched, the chances are higher that it
> > > may not be around anymore.
> > > > >
> >
> > Hi Lokesh,
> > Thanks!
> >
> > > > > The priority of this memory compared to other inactive memory that has
> been
> > > > > marked for a longer time likely depends on the user's expectations - How
> soon
> > > > > do users expect MADV_FREE to be reclaimed compared with old file
> folios.
> > > > >
> > > > > art guys moved to MADV_FREE from MADV_DONTNEED without any
> > > > > useful performance data and reason in the changelog:
> > > > > https://android-review.googlesource.com/c/platform/art/+/2633132
> > > > >
> > > > > Since art is the Android Java heap, it can be quite large. This increases the
> > > > > likelihood of packing the file LRU and reduces the chances of reclaiming
> > > > > anonymous memory, which could result in more file re-faults while
> helping
> > > > > anonymous folio persist longer in memory.
> > >
> > > Individual heaps of android apps are not big, and even in there we
> > > don't call MADV_FREE on the entire heap.
> >
> > How do you define "Individual heaps of android apps", do you know the usual
> > total_size for a phone with memory pressure by running multiple apps and
> how
> > much for each app?
> >
> Every app is a separate process and therefore has its own private ART
> heap. Those numbers that you are asking vary drastically. But here's
> what I can tell you:
> 
> Max heap size for an app is 512MB typically. But it is rarely entirely
> used. Typical heap usage is 50MB to 250MB. But as I said, not all of
> it is MADV_FREE'ed. Only those pages which are freed after GC
> compaction are.
> > > > >
> > > > > I am really curious why art guys have moved to MADV_FREE if we have
> > > > > an approach to reach them.
> > >
> > > Honestly, it makes little sense as a user that calling MADV_FREE on an
> > > anonymous mapping will impact file LRU. That was never the intention
> > > with our ART change.
> > >
> >
> > This is just how MADV_FREE is implemented in the kernel, this kind of lazyfree
> > anon folios are moved to file but *NOT* anon LRU.
> >
> > > From our perspective, once a set of pages are MADV_FREE'ed, they are
> > > like a page-cache. It gives an opportunity, without hurting memory
> > > use, to avoid overhead of page-faults, which happen frequently after
> > > GC is done on running apps.
> > >
> > > IMHO, within LRU_INACTIVE_FILE, MADV_FREE'ed pages should be
> > > prioritized for reclamation over file ones.
> >
> > This is exactly what this patch is doing, putting lazyfree anon folios
> > to the tail of file LRU so that they can be reclaimed earlier than file
> > folios. But the question is: is the requirement "MADV_FREE'ed pages
> > should be prioritized for reclamation over file ones" universally true for
> > all other non-Android users?
> >
> That's definitely an important question to get answered. But putting
> my users hat on again, by explicitly MADV_FREE'ing we ask for that
> behavior. IMHO, MADV_FREE'ed pages should be the first ones to be
> reclaimed on memory pressure.
For non-Android systems, perhaps the author of MADV_FREE can provide a more
reasonable opinion;
 
Add Minchan Kim.
Please forgive me for forgetting to add you when sending the patch.
> > > >
> > > > Adding Lokesh.
> > > > Lokesh, could you please comment on the reasoning behind the above
> > > > mentioned change?
> > >
> > > Adding Nicolas as well, in case he wants to add something.
> > > >
> > > > >
> > > > > >
> > > > > > --
> > > > > > Michal Hocko
> > > > > > SUSE Labs
> > > > > >
> > > > >
> >
> > Thanks
> > Barry
Minchan Kim Aug. 27, 2024, 5:56 p.m. UTC | #18
On Tue, Aug 27, 2024 at 04:07:57AM +0000, gaoxu wrote:
> 
> 
> > -----邮件原件-----
> > 发件人: Lokesh Gidra <lokeshgidra@google.com>
> > 发送时间: 2024年8月27日 8:12
> > 收件人: Barry Song <21cnbao@gmail.com>
> > 抄送: Suren Baghdasaryan <surenb@google.com>; Nicolas Geoffray
> > <ngeoffray@google.com>; Michal Hocko <mhocko@suse.com>; gaoxu
> > <gaoxu2@honor.com>; Andrew Morton <akpm@linux-foundation.org>;
> > linux-mm@kvack.org; linux-kernel@vger.kernel.org; Shaohua Li <shli@fb.com>;
> > yipengxiang <yipengxiang@honor.com>; fengbaopeng
> > <fengbaopeng@honor.com>; Kalesh Singh <kaleshsingh@google.com>
> > 主题: Re: [PATCH v2] mm: add lazyfree folio to lru tail
> > 
> > On Mon, Aug 26, 2024 at 12:55 PM Barry Song <21cnbao@gmail.com> wrote:
> > >
> > > On Tue, Aug 27, 2024 at 4:37 AM Lokesh Gidra <lokeshgidra@google.com>
> > wrote:
> > > >
> > > > Thanks Suren for looping in
> > > >
> > > > On Fri, Aug 23, 2024 at 4:39 PM Suren Baghdasaryan <surenb@google.com>
> > wrote:
> > > > >
> > > > > On Wed, Aug 21, 2024 at 2:47 PM Barry Song <21cnbao@gmail.com>
> > wrote:
> > > > > >
> > > > > > On Wed, Aug 21, 2024 at 8:46 PM Michal Hocko <mhocko@suse.com>
> > wrote:
> > > > > > >
> > > > > > > On Fri 16-08-24 07:48:01, gaoxu wrote:
> > > > > > > > Replace lruvec_add_folio with lruvec_add_folio_tail in the
> > lru_lazyfree_fn:
> > > > > > > > 1. The lazy-free folio is added to the LRU_INACTIVE_FILE list. If it's
> > > > > > > >    moved to the LRU tail, it allows for faster release lazy-free folio
> > and
> > > > > > > >    reduces the impact on file refault.
> > > > > > >
> > > > > > > This has been discussed when MADV_FREE was introduced. The
> > question was
> > > > > > > whether this memory has a lower priority than other inactive memory
> > that
> > > > > > > has been marked that way longer ago. Also consider several
> > MADV_FREE
> > > > > > > users should they be LIFO from the reclaim POV?
> > > >
> > > > Thinking from the user's perspective, it seems to me that FIFO within
> > > > MADV_FREE'ed pages makes more sense. As a user I expect the longer a
> > > > MADV_FREE'ed page hasn't been touched, the chances are higher that it
> > > > may not be around anymore.
> > > > > >
> > >
> > > Hi Lokesh,
> > > Thanks!
> > >
> > > > > > The priority of this memory compared to other inactive memory that has
> > been
> > > > > > marked for a longer time likely depends on the user's expectations - How
> > soon
> > > > > > do users expect MADV_FREE to be reclaimed compared with old file
> > folios.
> > > > > >
> > > > > > art guys moved to MADV_FREE from MADV_DONTNEED without any
> > > > > > useful performance data and reason in the changelog:
> > > > > > https://android-review.googlesource.com/c/platform/art/+/2633132
> > > > > >
> > > > > > Since art is the Android Java heap, it can be quite large. This increases the
> > > > > > likelihood of packing the file LRU and reduces the chances of reclaiming
> > > > > > anonymous memory, which could result in more file re-faults while
> > helping
> > > > > > anonymous folio persist longer in memory.
> > > >
> > > > Individual heaps of android apps are not big, and even in there we
> > > > don't call MADV_FREE on the entire heap.
> > >
> > > How do you define "Individual heaps of android apps", do you know the usual
> > > total_size for a phone with memory pressure by running multiple apps and
> > how
> > > much for each app?
> > >
> > Every app is a separate process and therefore has its own private ART
> > heap. Those numbers that you are asking vary drastically. But here's
> > what I can tell you:
> > 
> > Max heap size for an app is 512MB typically. But it is rarely entirely
> > used. Typical heap usage is 50MB to 250MB. But as I said, not all of
> > it is MADV_FREE'ed. Only those pages which are freed after GC
> > compaction are.
> > > > > >
> > > > > > I am really curious why art guys have moved to MADV_FREE if we have
> > > > > > an approach to reach them.
> > > >
> > > > Honestly, it makes little sense as a user that calling MADV_FREE on an
> > > > anonymous mapping will impact file LRU. That was never the intention
> > > > with our ART change.
> > > >
> > >
> > > This is just how MADV_FREE is implemented in the kernel, this kind of lazyfree
> > > anon folios are moved to file but *NOT* anon LRU.
> > >
> > > > From our perspective, once a set of pages are MADV_FREE'ed, they are
> > > > like a page-cache. It gives an opportunity, without hurting memory
> > > > use, to avoid overhead of page-faults, which happen frequently after
> > > > GC is done on running apps.
> > > >
> > > > IMHO, within LRU_INACTIVE_FILE, MADV_FREE'ed pages should be
> > > > prioritized for reclamation over file ones.
> > >
> > > This is exactly what this patch is doing, putting lazyfree anon folios
> > > to the tail of file LRU so that they can be reclaimed earlier than file
> > > folios. But the question is: is the requirement "MADV_FREE'ed pages
> > > should be prioritized for reclamation over file ones" universally true for
> > > all other non-Android users?
> > >
> > That's definitely an important question to get answered. But putting
> > my users hat on again, by explicitly MADV_FREE'ing we ask for that
> > behavior. IMHO, MADV_FREE'ed pages should be the first ones to be
> > reclaimed on memory pressure.
> For non-Android systems, perhaps the author of MADV_FREE can provide a more
> reasonable opinion;
>  
> Add Minchan Kim.
> Please forgive me for forgetting to add you when sending the patch.

AFAIR, there were two concerns:

1. The file LRU would contain pages used only once.

While MADV_FREE allows discarding pages under memory pressure, the system would
still have non-working set pages within the file LRU (e.g., those used only once).


2. LRU inversion among MADV_FREE users.

Consider this time order:

1. A process: MADV_FREE
2. B process: MADV_FREE
3. C process: MADV_FREE

The moving tail approach would discard the most recent pages from Process C first,
instead of those from Process A.

Of course, this isn't universally true for all workloads, but it's the reality.

At the time, I proposed introducing an additional "ez_reclaimable" LRU list to store MADV_FREE pages
(and potentially other hinted pages in the future).
This would allow differentiating priority among LRU lists based on knobs or heuristics.
However, this idea wasn't well-received.
gaoxu Aug. 29, 2024, 3:55 a.m. UTC | #19
> On Tue, Aug 27, 2024 at 04:07:57AM +0000, gaoxu wrote:
> > >
> > > On Mon, Aug 26, 2024 at 12:55 PM Barry Song <21cnbao@gmail.com>
> wrote:
> > > >
> > > > On Tue, Aug 27, 2024 at 4:37 AM Lokesh Gidra <lokeshgidra@google.com>
> > > wrote:
> > > > >
> > > > > Thanks Suren for looping in
> > > > >
> > > > > On Fri, Aug 23, 2024 at 4:39 PM Suren Baghdasaryan
> <surenb@google.com>
> > > wrote:
> > > > > >
> > > > > > On Wed, Aug 21, 2024 at 2:47 PM Barry Song <21cnbao@gmail.com>
> > > wrote:
> > > > > > >
> > > > > > > On Wed, Aug 21, 2024 at 8:46 PM Michal Hocko
> <mhocko@suse.com>
> > > wrote:
> > > > > > > >
> > > > > > > > On Fri 16-08-24 07:48:01, gaoxu wrote:
> > > > > > > > > Replace lruvec_add_folio with lruvec_add_folio_tail in the
> > > lru_lazyfree_fn:
> > > > > > > > > 1. The lazy-free folio is added to the LRU_INACTIVE_FILE list. If
> it's
> > > > > > > > >    moved to the LRU tail, it allows for faster release lazy-free
> folio
> > > and
> > > > > > > > >    reduces the impact on file refault.
> > > > > > > >
> > > > > > > > This has been discussed when MADV_FREE was introduced. The
> > > question was
> > > > > > > > whether this memory has a lower priority than other inactive
> memory
> > > that
> > > > > > > > has been marked that way longer ago. Also consider several
> > > MADV_FREE
> > > > > > > > users should they be LIFO from the reclaim POV?
> > > > >
> > > > > Thinking from the user's perspective, it seems to me that FIFO within
> > > > > MADV_FREE'ed pages makes more sense. As a user I expect the longer a
> > > > > MADV_FREE'ed page hasn't been touched, the chances are higher that it
> > > > > may not be around anymore.
> > > > > > >
> > > >
> > > > Hi Lokesh,
> > > > Thanks!
> > > >
> > > > > > > The priority of this memory compared to other inactive memory that
> has
> > > been
> > > > > > > marked for a longer time likely depends on the user's expectations -
> How
> > > soon
> > > > > > > do users expect MADV_FREE to be reclaimed compared with old file
> > > folios.
> > > > > > >
> > > > > > > art guys moved to MADV_FREE from MADV_DONTNEED without any
> > > > > > > useful performance data and reason in the changelog:
> > > > > > > https://android-review.googlesource.com/c/platform/art/+/2633132
> > > > > > >
> > > > > > > Since art is the Android Java heap, it can be quite large. This increases
> the
> > > > > > > likelihood of packing the file LRU and reduces the chances of
> reclaiming
> > > > > > > anonymous memory, which could result in more file re-faults while
> > > helping
> > > > > > > anonymous folio persist longer in memory.
> > > > >
> > > > > Individual heaps of android apps are not big, and even in there we
> > > > > don't call MADV_FREE on the entire heap.
> > > >
> > > > How do you define "Individual heaps of android apps", do you know the
> usual
> > > > total_size for a phone with memory pressure by running multiple apps and
> > > how
> > > > much for each app?
> > > >
> > > Every app is a separate process and therefore has its own private ART
> > > heap. Those numbers that you are asking vary drastically. But here's
> > > what I can tell you:
> > >
> > > Max heap size for an app is 512MB typically. But it is rarely entirely
> > > used. Typical heap usage is 50MB to 250MB. But as I said, not all of
> > > it is MADV_FREE'ed. Only those pages which are freed after GC
> > > compaction are.
> > > > > > >
> > > > > > > I am really curious why art guys have moved to MADV_FREE if we
> have
> > > > > > > an approach to reach them.
> > > > >
> > > > > Honestly, it makes little sense as a user that calling MADV_FREE on an
> > > > > anonymous mapping will impact file LRU. That was never the intention
> > > > > with our ART change.
> > > > >
> > > >
> > > > This is just how MADV_FREE is implemented in the kernel, this kind of
> lazyfree
> > > > anon folios are moved to file but *NOT* anon LRU.
> > > >
> > > > > From our perspective, once a set of pages are MADV_FREE'ed, they are
> > > > > like a page-cache. It gives an opportunity, without hurting memory
> > > > > use, to avoid overhead of page-faults, which happen frequently after
> > > > > GC is done on running apps.
> > > > >
> > > > > IMHO, within LRU_INACTIVE_FILE, MADV_FREE'ed pages should be
> > > > > prioritized for reclamation over file ones.
> > > >
> > > > This is exactly what this patch is doing, putting lazyfree anon folios
> > > > to the tail of file LRU so that they can be reclaimed earlier than file
> > > > folios. But the question is: is the requirement "MADV_FREE'ed pages
> > > > should be prioritized for reclamation over file ones" universally true for
> > > > all other non-Android users?
> > > >
> > > That's definitely an important question to get answered. But putting
> > > my users hat on again, by explicitly MADV_FREE'ing we ask for that
> > > behavior. IMHO, MADV_FREE'ed pages should be the first ones to be
> > > reclaimed on memory pressure.
> > For non-Android systems, perhaps the author of MADV_FREE can provide a
> more
> > reasonable opinion;
> >
> > Add Minchan Kim.
> > Please forgive me for forgetting to add you when sending the patch.
> 
> AFAIR, there were two concerns:
> 
> 1. The file LRU would contain pages used only once.
> 
> While MADV_FREE allows discarding pages under memory pressure, the system
> would
> still have non-working set pages within the file LRU (e.g., those used only once).
> 
> 
> 2. LRU inversion among MADV_FREE users.
> 
> Consider this time order:
> 
> 1. A process: MADV_FREE
> 2. B process: MADV_FREE
> 3. C process: MADV_FREE
> 
> The moving tail approach would discard the most recent pages from Process C
> first,
> instead of those from Process A.
> 
> Of course, this isn't universally true for all workloads, but it's the reality.
After enabling MGLRU, the implementation of age and evict based on gen dilutes the FIFO mechanism. Although the joining time points are different, they are all reclaimed based on the same gen. 
Android has always been plagued by performance issues caused by high IO. Many engineers adjust strategies to prefer reclaiming anon when the system is low on memory. For the same reason, 
we believe lazy free folio should prioritize file reclamation.(If I misunderstood, please correct me.)

For other discussions that lean towards reclaiming anon folio, please refer to:
https://patchwork.kernel.org/project/linux-mm/cover/20231108065818.19932-1-link@vivo.com/

Adding lazyfree folio to the LRU tail has no impact on the Android system, allowing the system to normally utilize the reuse of MADV_FREE when not in a low mem state.
If added to the file LRU head, the Android system will encounter various issues such as high IO and heavy kswapd load, forcing us to prohibit Android ART from continuing to use MADV_FREE.
Adding lazyfree folio to the LRU tail is not the best approach, but it is more acceptable compared to adding it to the LRU head.

> 
> At the time, I proposed introducing an additional "ez_reclaimable" LRU list to
> store MADV_FREE pages
> (and potentially other hinted pages in the future).
> This would allow differentiating priority among LRU lists based on knobs or
> heuristics.
This solution looks good, might need to think about how to adapt it to mglru.
> However, this idea wasn't well-received.
Andrew Morton Sept. 9, 2024, 10:22 p.m. UTC | #20
On Tue, 27 Aug 2024 10:29:11 +0800 Hailong Liu <hailong.liu@oppo.com> wrote:

> On Tue, 27. Aug 14:18, Barry Song wrote:
> > On Tue, Aug 27, 2024 at 2:13 PM Hailong Liu <hailong.liu@oppo.com> wrote:
> > >
> > > On Mon, 26. Aug 09:37, Lokesh Gidra wrote:
> > > >
> > > > IMHO, within LRU_INACTIVE_FILE, MADV_FREE'ed pages should be
> > > > prioritized for reclamation over file ones.
> > >
> > >
> > > > >
> > > > > Adding Lokesh.
> > > > > Lokesh, could you please comment on the reasoning behind the above
> > > > > mentioned change?
> > > >
> > > > Adding Nicolas as well, in case he wants to add something.
> > > IMHO, lruvec_add_folio is enough. if lruvec_add_folio_tail why not use
> > > MADV_DONTNEED instead? In MM the reclaim policy prefer to reclaim file cache, if
> > > MADV_FREE'd pages directly add to the tail, they might be reclaimed instantly.
> > > Also the benefit of workingset_refault_file cannot be convinced for me.
> >
> > My understanding is that MADV_DONTNEED will immediately free the memory,
> > whereas MADV_FREE will release memory only under memory pressure. If
> > memory pressure is low, the anonymous memory may still be gotten back
> > without causing page faults. This might be what Lokesh is aiming to achieve.
> >
> Hmm, IIUC, for the reason of watermark, the kswapd would do reclamation without
> memory pressure. I worried here is that the madv_free'd pages reclaimed too fast
> if add to tail.

I'm not seeing much clarity on whether we should merge this change.  I
think I'll drop this version - let's please revisit after -rc1.
Michal Hocko Sept. 10, 2024, 8:20 a.m. UTC | #21
On Mon 09-09-24 15:22:15, Andrew Morton wrote:
[...]
> I'm not seeing much clarity on whether we should merge this change.  I
> think I'll drop this version - let's please revisit after -rc1.

I think the biggest thing to focus on is the change of the behavior and
whether LIFO aging strategy for MADV_FREE pages is really desirable. It
is a noticeable LRU antipattern.
Barry Song Sept. 10, 2024, 8:51 a.m. UTC | #22
On Thu, Aug 29, 2024 at 3:55 PM gaoxu <gaoxu2@honor.com> wrote:
>
> > On Tue, Aug 27, 2024 at 04:07:57AM +0000, gaoxu wrote:
> > > >
> > > > On Mon, Aug 26, 2024 at 12:55 PM Barry Song <21cnbao@gmail.com>
> > wrote:
> > > > >
> > > > > On Tue, Aug 27, 2024 at 4:37 AM Lokesh Gidra <lokeshgidra@google.com>
> > > > wrote:
> > > > > >
> > > > > > Thanks Suren for looping in
> > > > > >
> > > > > > On Fri, Aug 23, 2024 at 4:39 PM Suren Baghdasaryan
> > <surenb@google.com>
> > > > wrote:
> > > > > > >
> > > > > > > On Wed, Aug 21, 2024 at 2:47 PM Barry Song <21cnbao@gmail.com>
> > > > wrote:
> > > > > > > >
> > > > > > > > On Wed, Aug 21, 2024 at 8:46 PM Michal Hocko
> > <mhocko@suse.com>
> > > > wrote:
> > > > > > > > >
> > > > > > > > > On Fri 16-08-24 07:48:01, gaoxu wrote:
> > > > > > > > > > Replace lruvec_add_folio with lruvec_add_folio_tail in the
> > > > lru_lazyfree_fn:
> > > > > > > > > > 1. The lazy-free folio is added to the LRU_INACTIVE_FILE list. If
> > it's
> > > > > > > > > >    moved to the LRU tail, it allows for faster release lazy-free
> > folio
> > > > and
> > > > > > > > > >    reduces the impact on file refault.
> > > > > > > > >
> > > > > > > > > This has been discussed when MADV_FREE was introduced. The
> > > > question was
> > > > > > > > > whether this memory has a lower priority than other inactive
> > memory
> > > > that
> > > > > > > > > has been marked that way longer ago. Also consider several
> > > > MADV_FREE
> > > > > > > > > users should they be LIFO from the reclaim POV?
> > > > > >
> > > > > > Thinking from the user's perspective, it seems to me that FIFO within
> > > > > > MADV_FREE'ed pages makes more sense. As a user I expect the longer a
> > > > > > MADV_FREE'ed page hasn't been touched, the chances are higher that it
> > > > > > may not be around anymore.
> > > > > > > >
> > > > >
> > > > > Hi Lokesh,
> > > > > Thanks!
> > > > >
> > > > > > > > The priority of this memory compared to other inactive memory that
> > has
> > > > been
> > > > > > > > marked for a longer time likely depends on the user's expectations -
> > How
> > > > soon
> > > > > > > > do users expect MADV_FREE to be reclaimed compared with old file
> > > > folios.
> > > > > > > >
> > > > > > > > art guys moved to MADV_FREE from MADV_DONTNEED without any
> > > > > > > > useful performance data and reason in the changelog:
> > > > > > > > https://android-review.googlesource.com/c/platform/art/+/2633132
> > > > > > > >
> > > > > > > > Since art is the Android Java heap, it can be quite large. This increases
> > the
> > > > > > > > likelihood of packing the file LRU and reduces the chances of
> > reclaiming
> > > > > > > > anonymous memory, which could result in more file re-faults while
> > > > helping
> > > > > > > > anonymous folio persist longer in memory.
> > > > > >
> > > > > > Individual heaps of android apps are not big, and even in there we
> > > > > > don't call MADV_FREE on the entire heap.
> > > > >
> > > > > How do you define "Individual heaps of android apps", do you know the
> > usual
> > > > > total_size for a phone with memory pressure by running multiple apps and
> > > > how
> > > > > much for each app?
> > > > >
> > > > Every app is a separate process and therefore has its own private ART
> > > > heap. Those numbers that you are asking vary drastically. But here's
> > > > what I can tell you:
> > > >
> > > > Max heap size for an app is 512MB typically. But it is rarely entirely
> > > > used. Typical heap usage is 50MB to 250MB. But as I said, not all of
> > > > it is MADV_FREE'ed. Only those pages which are freed after GC
> > > > compaction are.
> > > > > > > >
> > > > > > > > I am really curious why art guys have moved to MADV_FREE if we
> > have
> > > > > > > > an approach to reach them.
> > > > > >
> > > > > > Honestly, it makes little sense as a user that calling MADV_FREE on an
> > > > > > anonymous mapping will impact file LRU. That was never the intention
> > > > > > with our ART change.
> > > > > >
> > > > >
> > > > > This is just how MADV_FREE is implemented in the kernel, this kind of
> > lazyfree
> > > > > anon folios are moved to file but *NOT* anon LRU.
> > > > >
> > > > > > From our perspective, once a set of pages are MADV_FREE'ed, they are
> > > > > > like a page-cache. It gives an opportunity, without hurting memory
> > > > > > use, to avoid overhead of page-faults, which happen frequently after
> > > > > > GC is done on running apps.
> > > > > >
> > > > > > IMHO, within LRU_INACTIVE_FILE, MADV_FREE'ed pages should be
> > > > > > prioritized for reclamation over file ones.
> > > > >
> > > > > This is exactly what this patch is doing, putting lazyfree anon folios
> > > > > to the tail of file LRU so that they can be reclaimed earlier than file
> > > > > folios. But the question is: is the requirement "MADV_FREE'ed pages
> > > > > should be prioritized for reclamation over file ones" universally true for
> > > > > all other non-Android users?
> > > > >
> > > > That's definitely an important question to get answered. But putting
> > > > my users hat on again, by explicitly MADV_FREE'ing we ask for that
> > > > behavior. IMHO, MADV_FREE'ed pages should be the first ones to be
> > > > reclaimed on memory pressure.
> > > For non-Android systems, perhaps the author of MADV_FREE can provide a
> > more
> > > reasonable opinion;
> > >
> > > Add Minchan Kim.
> > > Please forgive me for forgetting to add you when sending the patch.
> >
> > AFAIR, there were two concerns:
> >
> > 1. The file LRU would contain pages used only once.
> >
> > While MADV_FREE allows discarding pages under memory pressure, the system
> > would
> > still have non-working set pages within the file LRU (e.g., those used only once).
> >
> >
> > 2. LRU inversion among MADV_FREE users.
> >
> > Consider this time order:
> >
> > 1. A process: MADV_FREE
> > 2. B process: MADV_FREE
> > 3. C process: MADV_FREE
> >
> > The moving tail approach would discard the most recent pages from Process C
> > first,
> > instead of those from Process A.
> >
> > Of course, this isn't universally true for all workloads, but it's the reality.
> After enabling MGLRU, the implementation of age and evict based on gen dilutes the FIFO mechanism. Although the joining time points are different, they are all reclaimed based on the same gen.
> Android has always been plagued by performance issues caused by high IO. Many engineers adjust strategies to prefer reclaiming anon when the system is low on memory. For the same reason,
> we believe lazy free folio should prioritize file reclamation.(If I misunderstood, please correct me.)
>
> For other discussions that lean towards reclaiming anon folio, please refer to:
> https://patchwork.kernel.org/project/linux-mm/cover/20231108065818.19932-1-link@vivo.com/
>
> Adding lazyfree folio to the LRU tail has no impact on the Android system, allowing the system to normally utilize the reuse of MADV_FREE when not in a low mem state.
> If added to the file LRU head, the Android system will encounter various issues such as high IO and heavy kswapd load, forcing us to prohibit Android ART from continuing to use MADV_FREE.
> Adding lazyfree folio to the LRU tail is not the best approach, but it is more acceptable compared to adding it to the LRU head.
>
> >
> > At the time, I proposed introducing an additional "ez_reclaimable" LRU list to
> > store MADV_FREE pages
> > (and potentially other hinted pages in the future).
> > This would allow differentiating priority among LRU lists based on knobs or
> > heuristics.
> This solution looks good, might need to think about how to adapt it to mglru.

That's right. Adapting to MGLRU isn't straightforward. We might need a separate
generation smaller than min_seq for this, or alternatively, it could
be handled by
a separate LRU list that isn't tied to any MGLRU generation. both seem hard.

> > However, this idea wasn't well-received.
diff mbox series

Patch

diff --git a/mm/swap.c b/mm/swap.c
index 6b838986d..e0dbfc983 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -641,7 +641,7 @@  static void lru_lazyfree(struct lruvec *lruvec, struct folio *folio)
 	 * anonymous folios
 	 */
 	folio_clear_swapbacked(folio);
-	lruvec_add_folio(lruvec, folio);
+	lruvec_add_folio_tail(lruvec, folio);
 
 	__count_vm_events(PGLAZYFREE, nr_pages);
 	__count_memcg_events(lruvec_memcg(lruvec), PGLAZYFREE, nr_pages);