diff mbox series

[v4,05/11] mm/migrate: clear __GFP_RECLAIM for THP allocation for migration

Message ID 1594107889-32228-6-git-send-email-iamjoonsoo.kim@lge.com (mailing list archive)
State New, archived
Headers show
Series clean-up the migration target allocation functions | expand

Commit Message

Joonsoo Kim July 7, 2020, 7:44 a.m. UTC
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

In mm/migrate.c, THP allocation for migration is called with the provided
gfp_mask | GFP_TRANSHUGE. This gfp_mask contains __GFP_RECLAIM and it
would be conflict with the intention of the GFP_TRANSHUGE.

GFP_TRANSHUGE/GFP_TRANSHUGE_LIGHT is introduced to control the reclaim
behaviour by well defined manner since overhead of THP allocation is
quite large and the whole system could suffer from it. So, they deals
with __GFP_RECLAIM mask deliberately. If gfp_mask contains __GFP_RECLAIM
and uses gfp_mask | GFP_TRANSHUGE(_LIGHT) for THP allocation, it means
that it breaks the purpose of the GFP_TRANSHUGE(_LIGHT).

This patch fixes this situation by clearing __GFP_RECLAIM in provided
gfp_mask. Note that there are some other THP allocations for migration
and they just uses GFP_TRANSHUGE(_LIGHT) directly. This patch would make
all THP allocation for migration consistent.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/migrate.c | 5 +++++
 1 file changed, 5 insertions(+)

Comments

Michal Hocko July 7, 2020, 11:40 a.m. UTC | #1
On Tue 07-07-20 16:44:43, Joonsoo Kim wrote:
> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> In mm/migrate.c, THP allocation for migration is called with the provided
> gfp_mask | GFP_TRANSHUGE. This gfp_mask contains __GFP_RECLAIM and it
> would be conflict with the intention of the GFP_TRANSHUGE.
> 
> GFP_TRANSHUGE/GFP_TRANSHUGE_LIGHT is introduced to control the reclaim
> behaviour by well defined manner since overhead of THP allocation is
> quite large and the whole system could suffer from it. So, they deals
> with __GFP_RECLAIM mask deliberately. If gfp_mask contains __GFP_RECLAIM
> and uses gfp_mask | GFP_TRANSHUGE(_LIGHT) for THP allocation, it means
> that it breaks the purpose of the GFP_TRANSHUGE(_LIGHT).

GFP_TRANSHUGE* is not a carved in stone design. Their primary reason to
exist is to control how hard to try for different allocation
paths/configurations because their latency expectations might be
largerly different. It is mostly the #PF path which aims to be as
lightweight as possible I believe nobody simply considered migration to be
very significant to even care. And I am still not sure it matters but
I would tend to agree that a consistency here is probably a very minor
plus.

Your changelog is slightly misleading in that regard because it suggests
that this is a real problem while it doesn't present any actual data.
It would be really nice to make the effective change really stand out.
We are only talking about __GFP_RECLAIM_KSWAPD here. So the only
difference is that the migration won't wake up kswapd now.

All that being said the changelog should be probably more explicit about
the fact that this is solely done for consistency and be honest that the
runtime effect is not really clear. This would help people reading it in
future.
Vlastimil Babka July 7, 2020, 12:17 p.m. UTC | #2
On 7/7/20 9:44 AM, js1304@gmail.com wrote:
> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> In mm/migrate.c, THP allocation for migration is called with the provided
> gfp_mask | GFP_TRANSHUGE. This gfp_mask contains __GFP_RECLAIM and it
> would be conflict with the intention of the GFP_TRANSHUGE.
> 
> GFP_TRANSHUGE/GFP_TRANSHUGE_LIGHT is introduced to control the reclaim
> behaviour by well defined manner since overhead of THP allocation is
> quite large and the whole system could suffer from it. So, they deals
> with __GFP_RECLAIM mask deliberately. If gfp_mask contains __GFP_RECLAIM
> and uses gfp_mask | GFP_TRANSHUGE(_LIGHT) for THP allocation, it means
> that it breaks the purpose of the GFP_TRANSHUGE(_LIGHT).
> 
> This patch fixes this situation by clearing __GFP_RECLAIM in provided
> gfp_mask. Note that there are some other THP allocations for migration
> and they just uses GFP_TRANSHUGE(_LIGHT) directly. This patch would make
> all THP allocation for migration consistent.
> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> ---
>  mm/migrate.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 02b31fe..ecd7615 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1547,6 +1547,11 @@ struct page *new_page_nodemask(struct page *page,
>  	}
>  
>  	if (PageTransHuge(page)) {
> +		/*
> +		 * clear __GFP_RECALIM since GFP_TRANSHUGE is the gfp_mask
> +		 * that chooses the reclaim masks deliberately.
> +		 */
> +		gfp_mask &= ~__GFP_RECLAIM;
>  		gfp_mask |= GFP_TRANSHUGE;

In addition to what Michal said...

The mask is not passed to this function, so I would just redefine it, as is done
in the hugetlb case. We probably don't even need the __GFP_RETRY_MAYFAIL for the
THP case asi it's just there to prevent OOM kill (per commit 0f55685627d6d ) and
the costly order of THP is enough for that.

>  		order = HPAGE_PMD_ORDER;
>  	}
>
Joonsoo Kim July 8, 2020, 7:17 a.m. UTC | #3
On Tue, Jul 07, 2020 at 02:17:55PM +0200, Vlastimil Babka wrote:
> On 7/7/20 9:44 AM, js1304@gmail.com wrote:
> > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > 
> > In mm/migrate.c, THP allocation for migration is called with the provided
> > gfp_mask | GFP_TRANSHUGE. This gfp_mask contains __GFP_RECLAIM and it
> > would be conflict with the intention of the GFP_TRANSHUGE.
> > 
> > GFP_TRANSHUGE/GFP_TRANSHUGE_LIGHT is introduced to control the reclaim
> > behaviour by well defined manner since overhead of THP allocation is
> > quite large and the whole system could suffer from it. So, they deals
> > with __GFP_RECLAIM mask deliberately. If gfp_mask contains __GFP_RECLAIM
> > and uses gfp_mask | GFP_TRANSHUGE(_LIGHT) for THP allocation, it means
> > that it breaks the purpose of the GFP_TRANSHUGE(_LIGHT).
> > 
> > This patch fixes this situation by clearing __GFP_RECLAIM in provided
> > gfp_mask. Note that there are some other THP allocations for migration
> > and they just uses GFP_TRANSHUGE(_LIGHT) directly. This patch would make
> > all THP allocation for migration consistent.
> > 
> > Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > ---
> >  mm/migrate.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index 02b31fe..ecd7615 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -1547,6 +1547,11 @@ struct page *new_page_nodemask(struct page *page,
> >  	}
> >  
> >  	if (PageTransHuge(page)) {
> > +		/*
> > +		 * clear __GFP_RECALIM since GFP_TRANSHUGE is the gfp_mask
> > +		 * that chooses the reclaim masks deliberately.
> > +		 */
> > +		gfp_mask &= ~__GFP_RECLAIM;
> >  		gfp_mask |= GFP_TRANSHUGE;
> 
> In addition to what Michal said...
> 
> The mask is not passed to this function, so I would just redefine it, as is done
> in the hugetlb case. We probably don't even need the __GFP_RETRY_MAYFAIL for the
> THP case asi it's just there to prevent OOM kill (per commit 0f55685627d6d ) and
> the costly order of THP is enough for that.

Will check.

Thanks.
Joonsoo Kim July 8, 2020, 7:19 a.m. UTC | #4
On Tue, Jul 07, 2020 at 01:40:19PM +0200, Michal Hocko wrote:
> On Tue 07-07-20 16:44:43, Joonsoo Kim wrote:
> > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > 
> > In mm/migrate.c, THP allocation for migration is called with the provided
> > gfp_mask | GFP_TRANSHUGE. This gfp_mask contains __GFP_RECLAIM and it
> > would be conflict with the intention of the GFP_TRANSHUGE.
> > 
> > GFP_TRANSHUGE/GFP_TRANSHUGE_LIGHT is introduced to control the reclaim
> > behaviour by well defined manner since overhead of THP allocation is
> > quite large and the whole system could suffer from it. So, they deals
> > with __GFP_RECLAIM mask deliberately. If gfp_mask contains __GFP_RECLAIM
> > and uses gfp_mask | GFP_TRANSHUGE(_LIGHT) for THP allocation, it means
> > that it breaks the purpose of the GFP_TRANSHUGE(_LIGHT).
> 
> GFP_TRANSHUGE* is not a carved in stone design. Their primary reason to
> exist is to control how hard to try for different allocation
> paths/configurations because their latency expectations might be
> largerly different. It is mostly the #PF path which aims to be as
> lightweight as possible I believe nobody simply considered migration to be
> very significant to even care. And I am still not sure it matters but
> I would tend to agree that a consistency here is probably a very minor
> plus.
> 
> Your changelog is slightly misleading in that regard because it suggests
> that this is a real problem while it doesn't present any actual data.
> It would be really nice to make the effective change really stand out.
> We are only talking about __GFP_RECLAIM_KSWAPD here. So the only
> difference is that the migration won't wake up kswapd now.
> 
> All that being said the changelog should be probably more explicit about
> the fact that this is solely done for consistency and be honest that the
> runtime effect is not really clear. This would help people reading it in
> future.

Okay. How about following changelog?

Thanks.

----------->8--------------------
Subject: [PATCH] mm/migrate: clear __GFP_RECLAIM for THP allocation for
 migration

In migration target allocation functions, THP allocations uses different
gfp_mask, especially, in regard to the reclaim gfp_mask. There is no
reason to use different reclaim gfp_mask for each cases and it is
an obstacle to make a common function in order to clean-up migration
target allocation functions. This patch fixes this situation by using
common reclaim gfp_mask for THP allocation.
Michal Hocko July 8, 2020, 7:48 a.m. UTC | #5
On Wed 08-07-20 16:19:17, Joonsoo Kim wrote:
> On Tue, Jul 07, 2020 at 01:40:19PM +0200, Michal Hocko wrote:
[...]
> Subject: [PATCH] mm/migrate: clear __GFP_RECLAIM for THP allocation for
>  migration
> 
> In migration target allocation functions, THP allocations uses different
> gfp_mask, especially, in regard to the reclaim gfp_mask. There is no
> reason to use different reclaim gfp_mask for each cases and it is
> an obstacle to make a common function in order to clean-up migration
> target allocation functions. This patch fixes this situation by using
> common reclaim gfp_mask for THP allocation.

I would find the following more understandable, feel free to reuse parts
that you like:
"
new_page_nodemask is a migration callback and it tries to use a common
gfp flags for the target page allocation whether it is a base page or a
THP. The later only adds GFP_TRANSHUGE to the given mask. This results
in the allocation being slightly more aggressive than necessary because
the resulting gfp mask will contain also __GFP_RECLAIM_KSWAPD. THP
allocations usually exclude this flag to reduce over eager background
reclaim during a high THP allocation load which has been seen during
large mmaps initialization. There is no indication that this is a
problem for migration as well but theoretically the same might happen
when migrating large mappings to a different node. Make the migration
callback consistent with regular THP allocations.
"
Joonsoo Kim July 9, 2020, 3:26 a.m. UTC | #6
2020년 7월 8일 (수) 오후 4:48, Michal Hocko <mhocko@kernel.org>님이 작성:
>
> On Wed 08-07-20 16:19:17, Joonsoo Kim wrote:
> > On Tue, Jul 07, 2020 at 01:40:19PM +0200, Michal Hocko wrote:
> [...]
> > Subject: [PATCH] mm/migrate: clear __GFP_RECLAIM for THP allocation for
> >  migration
> >
> > In migration target allocation functions, THP allocations uses different
> > gfp_mask, especially, in regard to the reclaim gfp_mask. There is no
> > reason to use different reclaim gfp_mask for each cases and it is
> > an obstacle to make a common function in order to clean-up migration
> > target allocation functions. This patch fixes this situation by using
> > common reclaim gfp_mask for THP allocation.
>
> I would find the following more understandable, feel free to reuse parts
> that you like:
> "
> new_page_nodemask is a migration callback and it tries to use a common
> gfp flags for the target page allocation whether it is a base page or a
> THP. The later only adds GFP_TRANSHUGE to the given mask. This results
> in the allocation being slightly more aggressive than necessary because
> the resulting gfp mask will contain also __GFP_RECLAIM_KSWAPD. THP
> allocations usually exclude this flag to reduce over eager background
> reclaim during a high THP allocation load which has been seen during
> large mmaps initialization. There is no indication that this is a
> problem for migration as well but theoretically the same might happen
> when migrating large mappings to a different node. Make the migration
> callback consistent with regular THP allocations.
> "

Looks good!
I will use this description.

Thanks.
Joonsoo Kim July 9, 2020, 7:17 a.m. UTC | #7
2020년 7월 7일 (화) 오후 9:17, Vlastimil Babka <vbabka@suse.cz>님이 작성:
>
> On 7/7/20 9:44 AM, js1304@gmail.com wrote:
> > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> >
> > In mm/migrate.c, THP allocation for migration is called with the provided
> > gfp_mask | GFP_TRANSHUGE. This gfp_mask contains __GFP_RECLAIM and it
> > would be conflict with the intention of the GFP_TRANSHUGE.
> >
> > GFP_TRANSHUGE/GFP_TRANSHUGE_LIGHT is introduced to control the reclaim
> > behaviour by well defined manner since overhead of THP allocation is
> > quite large and the whole system could suffer from it. So, they deals
> > with __GFP_RECLAIM mask deliberately. If gfp_mask contains __GFP_RECLAIM
> > and uses gfp_mask | GFP_TRANSHUGE(_LIGHT) for THP allocation, it means
> > that it breaks the purpose of the GFP_TRANSHUGE(_LIGHT).
> >
> > This patch fixes this situation by clearing __GFP_RECLAIM in provided
> > gfp_mask. Note that there are some other THP allocations for migration
> > and they just uses GFP_TRANSHUGE(_LIGHT) directly. This patch would make
> > all THP allocation for migration consistent.
> >
> > Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > ---
> >  mm/migrate.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index 02b31fe..ecd7615 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -1547,6 +1547,11 @@ struct page *new_page_nodemask(struct page *page,
> >       }
> >
> >       if (PageTransHuge(page)) {
> > +             /*
> > +              * clear __GFP_RECALIM since GFP_TRANSHUGE is the gfp_mask
> > +              * that chooses the reclaim masks deliberately.
> > +              */
> > +             gfp_mask &= ~__GFP_RECLAIM;
> >               gfp_mask |= GFP_TRANSHUGE;
>
> In addition to what Michal said...
>
> The mask is not passed to this function, so I would just redefine it, as is done
> in the hugetlb case. We probably don't even need the __GFP_RETRY_MAYFAIL for the
> THP case asi it's just there to prevent OOM kill (per commit 0f55685627d6d ) and
> the costly order of THP is enough for that.

As I said in another reply, provided __GFP_THISNODE should be handled
so just redefining it would not work.

Thanks.
diff mbox series

Patch

diff --git a/mm/migrate.c b/mm/migrate.c
index 02b31fe..ecd7615 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1547,6 +1547,11 @@  struct page *new_page_nodemask(struct page *page,
 	}
 
 	if (PageTransHuge(page)) {
+		/*
+		 * clear __GFP_RECALIM since GFP_TRANSHUGE is the gfp_mask
+		 * that chooses the reclaim masks deliberately.
+		 */
+		gfp_mask &= ~__GFP_RECLAIM;
 		gfp_mask |= GFP_TRANSHUGE;
 		order = HPAGE_PMD_ORDER;
 	}