mm, compaction: fix 'limit' in fast_isolate_freepages

Message ID 20210620145742.54565-1-vvghjk1234@gmail.com (mailing list archive)
State New

Commit Message

Wonhyuk Yang June 20, 2021, 2:57 p.m. UTC
Because of 'min(1, ...)', fast_isolate_freepages() sets 'limit'
to 0 or 1. This takes away opportunities to find candidate
pages. Also, even when 'limit' reaches zero, it still scans once,
which is inconsistent. So change the minimum value of 'limit' to 1.

Fixes: 5a811889de10f ("mm, compaction: use free lists to quickly locate a migration target")

Signed-off-by: Wonhyuk Yang <vvghjk1234@gmail.com>
---
 mm/compaction.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

Mel Gorman June 23, 2021, 9:15 a.m. UTC | #1
On Sun, Jun 20, 2021 at 11:57:42PM +0900, Wonhyuk Yang wrote:
> Because of 'min(1, ...)', fast_isolate_freepages() sets 'limit'
> to 0 or 1. This takes away opportunities to find candidate
> pages. Also, even when 'limit' reaches zero, it still scans once,
> which is inconsistent. So change the minimum value of 'limit' to 1.
> 

The changelog could do with a little polish.

In addition, what were the effects of this and what load did you use to
evaluate it? While your patch is mostly correct, it has the potential
side-effect of increasing system CPU usage in some cases and I'm curious
to hear what you observed. Minimally it is worth noting in the changelog
that there is a risk of increasing system CPU usage but that there are
advantages too. Describe them in the changelog in case a regression
bisects to your patch.

> Fixes: 5a811889de10f ("mm, compaction: use free lists to quickly locate a migration target")
> 
> Signed-off-by: Wonhyuk Yang <vvghjk1234@gmail.com>
> ---
>  mm/compaction.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 84fde270ae74..2e41e7ab1f55 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1380,7 +1380,7 @@ static int next_search_order(struct compact_control *cc, int order)
>  static unsigned long
>  fast_isolate_freepages(struct compact_control *cc)
>  {
> -	unsigned int limit = min(1U, freelist_scan_limit(cc) >> 1);
> +	unsigned int limit = max(1U, freelist_scan_limit(cc) >> 1);
>  	unsigned int nr_scanned = 0;
>  	unsigned long low_pfn, min_pfn, highest = 0;
>  	unsigned long nr_isolated = 0;

Ok.

> @@ -1456,7 +1456,7 @@ fast_isolate_freepages(struct compact_control *cc)
>  				high_pfn = pfn;
>  
>  				/* Shorten the scan if a candidate is found */
> -				limit >>= 1;
> +				limit = max(1U, limit >> 1);
>  			}
>  
>  			if (order_scanned >= limit)

This hunk should be dropped. Once a candidate free page has been
identified, it's ok to decay the limit to 0. This hunk introduces a risk
of increasing system CPU usage unnecessarily.

> @@ -1496,7 +1496,7 @@ fast_isolate_freepages(struct compact_control *cc)
>  		 * to freelist_scan_limit.
>  		 */
>  		if (order_scanned >= limit)
> -			limit = min(1U, limit >> 1);
> +			limit = max(1U, limit >> 1);
>  	}

The change is fine but I have a minor nitpick that you are free to
ignore. The comment above this block has a typo.

s/scan ig related/scan is related/

Ordinarily patches to fix spelling are ignored but you are altering this
area anyway and it's helpful to see the full comment when reviewing this
patch. I think it would be harmless to fix the spelling in the context
of this patch.

Thanks.
Wonhyuk Yang June 24, 2021, 2:18 p.m. UTC | #2
On Wed, Jun 23, 2021 at 6:15 PM Mel Gorman <mgorman@techsingularity.net> wrote:
>
> On Sun, Jun 20, 2021 at 11:57:42PM +0900, Wonhyuk Yang wrote:
> > Because of 'min(1, ...)', fast_isolate_freepages() sets 'limit'
> > to 0 or 1. This takes away opportunities to find candidate
> > pages. Also, even when 'limit' reaches zero, it still scans once,
> > which is inconsistent. So change the minimum value of 'limit' to 1.
> >
>
> The changelog could do with a little polish.
>
> In addition, what were the effects of this and what load did you use to
> evaluate it? While your patch is mostly correct, it has the potential
> side-effect of increasing system CPU usage in some cases and I'm curious
> to hear what you observed. Minimally it is worth noting in the changelog
> that there is a risk of increasing system CPU usage but that there are
> advantages too. Describe them in the changelog in case a regression
> bisects to your patch.

I tested it on the thpscale and the results are as follows.

                                5.12                    5.12
                             vanilla                 patched
Amean     fault-both-1        598.15 (   0.00%)       592.56 (   0.93%)
Amean     fault-both-3       1494.47 (   0.00%)      1514.35 (  -1.33%)
Amean     fault-both-5       2519.48 (   0.00%)      2471.76 (   1.89%)
Amean     fault-both-7       3173.85 (   0.00%)      3079.19 (   2.98%)
Amean     fault-both-12      8063.83 (   0.00%)      7858.29 (   2.55%)
Amean     fault-both-18      8781.20 (   0.00%)      7827.70 *  10.86%*
Amean     fault-both-24     12576.44 (   0.00%)     12250.20 (   2.59%)
Amean     fault-both-30     18503.27 (   0.00%)     17528.11 *   5.27%*
Amean     fault-both-32     16133.69 (   0.00%)     13874.24 *  14.00%*

                                         5.12           5.12
                                      vanilla        patched
Ops Compaction migrate scanned     6547133.00     5963901.00
Ops Compaction free scanned       32452453.00    26609101.00

One thing to worry about is that the results are very different every time.
Is there any precise way to measure this patch?

> > @@ -1456,7 +1456,7 @@ fast_isolate_freepages(struct compact_control *cc)
> >                               high_pfn = pfn;
> >
> >                               /* Shorten the scan if a candidate is found */
> > -                             limit >>= 1;
> > +                             limit = max(1U, limit >> 1);
> >                       }
> >
> >                       if (order_scanned >= limit)
>
> This hunk should be dropped. Once a candidate free page has been
> identified, it's ok to decay the limit to 0. This hunk introduces a risk
> of increasing system CPU usage unnecessarily.

Yes, you are right. I'll take your opinion.

> > @@ -1496,7 +1496,7 @@ fast_isolate_freepages(struct compact_control *cc)
> >                * to freelist_scan_limit.
> >                */
> >               if (order_scanned >= limit)
> > -                     limit = min(1U, limit >> 1);
> > +                     limit = max(1U, limit >> 1);
> >       }
>
> The change is fine but I have a minor nitpick that you are free to
> ignore. The comment above this block has a typo.
>
> s/scan ig related/scan is related/
>
> Ordinarily patches to fix spelling are ignored but you are altering this
> area anyway and it's helpful to see the full comment when reviewing this
> patch. I think it would be harmless to fix the spelling in the context
> of this patch.

Okay, I'll fix this as well.

Thank you for your review.
Mel Gorman June 25, 2021, 10:21 a.m. UTC | #3
On Thu, Jun 24, 2021 at 11:18:57PM +0900, Wonhyuk Yang wrote:
> On Wed, Jun 23, 2021 at 6:15 PM Mel Gorman <mgorman@techsingularity.net> wrote:
> >
> > On Sun, Jun 20, 2021 at 11:57:42PM +0900, Wonhyuk Yang wrote:
> > > Because of 'min(1, ...)', fast_isolate_freepages() sets 'limit'
> > > to 0 or 1. This takes away opportunities to find candidate
> > > pages. Also, even when 'limit' reaches zero, it still scans once,
> > > which is inconsistent. So change the minimum value of 'limit' to 1.
> > >
> >
> > The changelog could do with a little polish.
> >
> > In addition, what were the effects of this and what load did you use to
> > evaluate it? While your patch is mostly correct, it has the potential
> > side-effect of increasing system CPU usage in some cases and I'm curious
> > to hear what you observed. Minimally it is worth noting in the changelog
> > that there is a risk of increasing system CPU usage but that there are
> > advantages too. Describe them in the changelog in case a regression
> > bisects to your patch.
> 
> I tested it on the thpscale and the results are as follows.
> 
>                                 5.12                    5.12
>                              vanilla                 patched
> Amean     fault-both-1        598.15 (   0.00%)       592.56 (   0.93%)
> Amean     fault-both-3       1494.47 (   0.00%)      1514.35 (  -1.33%)
> Amean     fault-both-5       2519.48 (   0.00%)      2471.76 (   1.89%)
> Amean     fault-both-7       3173.85 (   0.00%)      3079.19 (   2.98%)
> Amean     fault-both-12      8063.83 (   0.00%)      7858.29 (   2.55%)
> Amean     fault-both-18      8781.20 (   0.00%)      7827.70 *  10.86%*
> Amean     fault-both-24     12576.44 (   0.00%)     12250.20 (   2.59%)
> Amean     fault-both-30     18503.27 (   0.00%)     17528.11 *   5.27%*
> Amean     fault-both-32     16133.69 (   0.00%)     13874.24 *  14.00%*
> 
>                                          5.12           5.12
>                                       vanilla        patched
> Ops Compaction migrate scanned     6547133.00     5963901.00
> Ops Compaction free scanned       32452453.00    26609101.00
> 

Ok, mention this in the changelog and maybe include the overall system
CPU usage as well. It will be higher but should be acceptable.

> One thing to worry about is that the results are very different every time.
> Is there any precise way to measure this patch?
> 

Not with this workload, it was designed to simply hammer compaction
heavily to see if latencies were unacceptably high and also for tracing
various compaction corner cases.

> > > @@ -1456,7 +1456,7 @@ fast_isolate_freepages(struct compact_control *cc)
> > >                               high_pfn = pfn;
> > >
> > >                               /* Shorten the scan if a candidate is found */
> > > -                             limit >>= 1;
> > > +                             limit = max(1U, limit >> 1);
> > >                       }
> > >
> > >                       if (order_scanned >= limit)
> >
> > This hunk should be dropped. Once a candidate free page has been
> > identified, it's ok to decay the limit to 0. This hunk introduces a risk
> > of increasing system CPU usage unnecessarily.
> 
> Yes, you are right. I'll take your opinion.
> 

Thanks.

> > > @@ -1496,7 +1496,7 @@ fast_isolate_freepages(struct compact_control *cc)
> > >                * to freelist_scan_limit.
> > >                */
> > >               if (order_scanned >= limit)
> > > -                     limit = min(1U, limit >> 1);
> > > +                     limit = max(1U, limit >> 1);
> > >       }
> >
> > The change is fine but I have a minor nitpick that you are free to
> > ignore. The comment above this block has a typo.
> >
> > s/scan ig related/scan is related/
> >
> > Ordinarily patches to fix spelling are ignored but you are altering this
> > area anyway and it's helpful to see the full comment when reviewing this
> > patch. I think it would be harmless to fix the spelling in the context
> > of this patch.
> 
> Okay, I'll fix this as well.
> 
> Thank you for your review.

No problem, thank you for the patch. Please cc me on v2 and I'll rerun
some tests just to be sure before acking it.
Wonhyuk Yang June 26, 2021, 7:17 a.m. UTC | #4
On Fri, Jun 25, 2021 at 7:21 PM Mel Gorman <mgorman@techsingularity.net> wrote:
> On Thu, Jun 24, 2021 at 11:18:57PM +0900, Wonhyuk Yang wrote:
> > On Wed, Jun 23, 2021 at 6:15 PM Mel Gorman <mgorman@techsingularity.net> wrote:
> > >
> > > On Sun, Jun 20, 2021 at 11:57:42PM +0900, Wonhyuk Yang wrote:
> > > > Because of 'min(1, ...)', fast_isolate_freepages() sets 'limit'
> > > > to 0 or 1. This takes away opportunities to find candidate
> > > > pages. Also, even when 'limit' reaches zero, it still scans once,
> > > > which is inconsistent. So change the minimum value of 'limit' to 1.
> > > >
> > >
> > > The changelog could do with a little polish.
> > >
> > > In addition, what were the effects of this and what load did you use to
> > > evaluate it? While your patch is mostly correct, it has the potential
> > > side-effect of increasing system CPU usage in some cases and I'm curious
> > > to hear what you observed. Minimally it is worth noting in the changelog
> > > that there is a risk of increasing system CPU usage but that there are
> > > advantages too. Describe them in the changelog in case a regression
> > > bisects to your patch.
> >
> > I tested it on the thpscale and the results are as follows.
> >
> >                                 5.12                    5.12
> >                              vanilla                 patched
> > Amean     fault-both-1        598.15 (   0.00%)       592.56 (   0.93%)
> > Amean     fault-both-3       1494.47 (   0.00%)      1514.35 (  -1.33%)
> > Amean     fault-both-5       2519.48 (   0.00%)      2471.76 (   1.89%)
> > Amean     fault-both-7       3173.85 (   0.00%)      3079.19 (   2.98%)
> > Amean     fault-both-12      8063.83 (   0.00%)      7858.29 (   2.55%)
> > Amean     fault-both-18      8781.20 (   0.00%)      7827.70 *  10.86%*
> > Amean     fault-both-24     12576.44 (   0.00%)     12250.20 (   2.59%)
> > Amean     fault-both-30     18503.27 (   0.00%)     17528.11 *   5.27%*
> > Amean     fault-both-32     16133.69 (   0.00%)     13874.24 *  14.00%*
> >
> >                                          5.12           5.12
> >                                       vanilla        patched
> > Ops Compaction migrate scanned     6547133.00     5963901.00
> > Ops Compaction free scanned       32452453.00    26609101.00
> >
>
> Ok, mention this in the changelog and maybe include the overall system
> CPU usage as well. It will be higher but should be acceptable.
>
> > One thing to worry about is that the results are very different every time.
> > Is there any precise way to measure this patch?
> >
>
> Not with this workload, it was designed to simply hammer compaction
> heavily to see if latencies were unacceptably high and also for tracing
> various compaction corner cases.
>
> > > > @@ -1456,7 +1456,7 @@ fast_isolate_freepages(struct compact_control *cc)
> > > >                               high_pfn = pfn;
> > > >
> > > >                               /* Shorten the scan if a candidate is found */
> > > > -                             limit >>= 1;
> > > > +                             limit = max(1U, limit >> 1);
> > > >                       }
> > > >
> > > >                       if (order_scanned >= limit)
> > >
> > > This hunk should be dropped. Once a candidate free page has been
> > > identified, it's ok to decay the limit to 0. This hunk introduces a risk
> > > of increasing system CPU usage unnecessarily.
> >
> > Yes, you are right. I'll take your opinion.
> >
>
> Thanks.
>
> > > > @@ -1496,7 +1496,7 @@ fast_isolate_freepages(struct compact_control *cc)
> > > >                * to freelist_scan_limit.
> > > >                */
> > > >               if (order_scanned >= limit)
> > > > -                     limit = min(1U, limit >> 1);
> > > > +                     limit = max(1U, limit >> 1);
> > > >       }
> > >
> > > The change is fine but I have a minor nitpick that you are free to
> > > ignore. The comment above this block has a typo.
> > >
> > > s/scan ig related/scan is related/
> > >
> > > Ordinarily patches to fix spelling are ignored but you are altering this
> > > area anyway and it's helpful to see the full comment when reviewing this
> > > patch. I think it would be harmless to fix the spelling in the context
> > > of this patch.
> >
> > Okay, I'll fix this as well.
> >
> > Thank you for your review.
>
> No problem, thank you for the patch. Please cc me on v2 and I'll rerun
> some tests just to be sure before acking it.
>

Okay, I'll do that.
Patch

diff --git a/mm/compaction.c b/mm/compaction.c
index 84fde270ae74..2e41e7ab1f55 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1380,7 +1380,7 @@  static int next_search_order(struct compact_control *cc, int order)
 static unsigned long
 fast_isolate_freepages(struct compact_control *cc)
 {
-	unsigned int limit = min(1U, freelist_scan_limit(cc) >> 1);
+	unsigned int limit = max(1U, freelist_scan_limit(cc) >> 1);
 	unsigned int nr_scanned = 0;
 	unsigned long low_pfn, min_pfn, highest = 0;
 	unsigned long nr_isolated = 0;
@@ -1456,7 +1456,7 @@  fast_isolate_freepages(struct compact_control *cc)
 				high_pfn = pfn;
 
 				/* Shorten the scan if a candidate is found */
-				limit >>= 1;
+				limit = max(1U, limit >> 1);
 			}
 
 			if (order_scanned >= limit)
@@ -1496,7 +1496,7 @@  fast_isolate_freepages(struct compact_control *cc)
 		 * to freelist_scan_limit.
 		 */
 		if (order_scanned >= limit)
-			limit = min(1U, limit >> 1);
+			limit = max(1U, limit >> 1);
 	}
 
 	if (!page) {