diff mbox series

[mm-unstable] mm/damon/core: increase regions merge aggressiveness while respecting min_nr_regions

Message ID 20240626164753.46270-1-sj@kernel.org (mailing list archive)
State New
Headers show
Series [mm-unstable] mm/damon/core: increase regions merge aggressiveness while respecting min_nr_regions | expand

Commit Message

SeongJae Park June 26, 2024, 4:47 p.m. UTC
DAMON's merge mechanism has two thresholds, namely those for access
frequency and size.  The access frequency threshold avoids merging two
adjacent regions that having pretty different access frequency.

The size threshold is calculated as total size of regions divided by
min_nr_regions.  Merging operation skip merging two adjacent regions if
the resulting region's size can be larger than the threshold.  This is
for meeting min_nr_regions.

Commit 44fdaf596984 ("mm/damon/core: merge regions aggressively when
max_nr_regions is unmet") of mm-unstable, however, ignores the
min_nr_regions by increasing not only access frequency threshold but
also the size threshold.

The commit also has one more problem.  User could set DAMON target
regions with more than max_nr_regions discrete regions.  Because DAMON
cannot merge non-adjacent regions, the number of regions will never be
lower than max_nr_regions regardless of the increased thresholds.  As a
result, the function can infinitely repeat the loop.

Increase only access frequency threshold, up to only possible maximum
value.

Fixes: 44fdaf596984 ("mm/damon/core: merge regions aggressively when max_nr_regions is unmet") # mm-unstable
Signed-off-by: SeongJae Park <sj@kernel.org>
---
 mm/damon/core.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

Comments

SeongJae Park June 26, 2024, 9:49 p.m. UTC | #1
On Wed, 26 Jun 2024 09:47:53 -0700 SeongJae Park <sj@kernel.org> wrote:

> DAMON's merge mechanism has two thresholds, namely those for access
> frequency and size.  The access frequency threshold avoids merging two
> adjacent regions that having pretty different access frequency.
> 
> The size threshold is calculated as total size of regions divided by
> min_nr_regions.  Merging operation skip merging two adjacent regions if
> the resulting region's size can be larger than the threshold.  This is
> for meeting min_nr_regions.
> 
> Commit 44fdaf596984 ("mm/damon/core: merge regions aggressively when
> max_nr_regions is unmet") of mm-unstable, however, ignores the
> min_nr_regions by increasing not only access frequency threshold but
> also the size threshold.
> 
> The commit also has one more problem.  User could set DAMON target
> regions with more than max_nr_regions discrete regions.  Because DAMON
> cannot merge non-adjacent regions, the number of regions will never be
> lower than max_nr_regions regardless of the increased thresholds.  As a
> result, the function can infinitely repeat the loop.
> 
> Increase only access frequency threshold, up to only possible maximum
> value.
> 
> Fixes: 44fdaf596984 ("mm/damon/core: merge regions aggressively when max_nr_regions is unmet") # mm-unstable
> Signed-off-by: SeongJae Park <sj@kernel.org>
> ---
>  mm/damon/core.c | 19 ++++++++++---------
>  1 file changed, 10 insertions(+), 9 deletions(-)
> 
> diff --git a/mm/damon/core.c b/mm/damon/core.c
> index e6598c44b53c..dac27b949403 100644
> --- a/mm/damon/core.c
> +++ b/mm/damon/core.c
[...]
> +	max_thres = c->attrs.aggr_interval /
> +		(c->attrs.sample_interval ?  c->attrs.sample_interval : 1);
>  	do {
>  		nr_regions = 0;
>  		damon_for_each_target(t, c) {
> @@ -1716,8 +1717,8 @@ static void kdamond_merge_regions(struct damon_ctx *c, unsigned int threshold,
>  			nr_regions += damon_nr_regions(t);
>  		}
>  		threshold = max(1, threshold * 2);
> -		sz_limit = max(1, sz_limit * 2);
> -	} while (nr_regions > c->attrs.max_nr_regions);
> +	} while (nr_regions > c->attrs.max_nr_regions &&
> +			threshold <= max_thres);

This code means that kdamond_merge_regions() stops this repeated merge attempt
if the merge threshold that increased for next attempt is higher than the
possible maximum threshold.  And because the increase of the threshold is made
by picking a maximum value between one and the last-used threshold multiplying
two, the merge attempt with maximum threshold will not be made unless both the
maximum threshold and the threshold to increase are powers of two.  In maximum
situation (e.g., region 1 has 100% access frequency, region 2 has 0% access
frequency, so on), this means the max_nr_regions violation cannot be recovered
by the attempts.

This can be fixed by changing it to stop repeated attempt if the last-used
threshold is same to or higher than the maximum possible threshold, like below.

I'll send the fix of the fix as a formal patch soon.

FYI, the original fix is definitely better to be merged in stable kernels, but
not urgent in my opinion, since the problematic case is not common and the
behavior was same since the beginning of DAMON.  Andrew, if you feel the
original fix is not stable yet, please feel free to delay moving it to
hotfix-stable for one week or two.


Thanks,
SJ

================================= >8 ==========================================
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -1773,7 +1773,7 @@ static void kdamond_merge_regions(struct damon_ctx *c, unsigned int threshold,
                }
                threshold = max(1, threshold * 2);
        } while (nr_regions > c->attrs.max_nr_regions &&
-                       threshold <= max_thres);
+                       threshold / 2 < max_thres);
 }

 /*
Andrew Morton June 26, 2024, 10:05 p.m. UTC | #2
On Wed, 26 Jun 2024 14:49:54 -0700 SeongJae Park <sj@kernel.org> wrote:

> > @@ -1716,8 +1717,8 @@ static void kdamond_merge_regions(struct damon_ctx *c, unsigned int threshold,
> >  			nr_regions += damon_nr_regions(t);
> >  		}
> >  		threshold = max(1, threshold * 2);
> > -		sz_limit = max(1, sz_limit * 2);
> > -	} while (nr_regions > c->attrs.max_nr_regions);
> > +	} while (nr_regions > c->attrs.max_nr_regions &&
> > +			threshold <= max_thres);
> 
> This code means that kdamond_merge_regions() stops this repeated merge attempt
> if the merge threshold that increased for next attempt is higher than the
> possible maximum threshold.  And because the increase of the threshold is made
> by picking a maximum value between one and the last-used threshold multiplying
> two, the merge attempt with maximum threshold will not be made unless both the
> maximum threshold and the threshold to increase are powers of two.  In maximum
> situation (e.g., region 1 has 100% access frequency, region 2 has 0% access
> frequency, so on), this means the max_nr_regions violation cannot be recovered
> by the attempts.
> 
> This can be fixed by changing it to stop repeated attempt if the last-used
> threshold is same to or higher than the maximum possible threshold, like below.
> 
> I'll send the fix of the fix as a formal patch soon.
> 
> FYI, the original fix is definitely better to be merged in stable kernels, but
> not urgent in my opinion, since the problematic case is not common and the
> behavior was same since the beginning of DAMON.  Andrew, if you feel the
> original fix is not stable yet, please feel free to delay moving it to
> hotfix-stable for one week or two.

That's fine - we can merge cc:stable patches any time, really - they
will still get backported.  There's only a hurry to get fixes merged up
if they're security-related or if the issue is causing people problems.

In this case I'll await your -fix-2.patch and we can merge the patch
next week.
SeongJae Park June 27, 2024, 4:33 p.m. UTC | #3
Hi Andrew,

On Wed, 26 Jun 2024 15:05:16 -0700 Andrew Morton <akpm@linux-foundation.org> wrote:

> On Wed, 26 Jun 2024 14:49:54 -0700 SeongJae Park <sj@kernel.org> wrote:
> 
[...]
> > I'll send the fix of the fix as a formal patch soon.
> > 
> > FYI, the original fix is definitely better to be merged in stable kernels, but
> > not urgent in my opinion, since the problematic case is not common and the
> > behavior was same since the beginning of DAMON.  Andrew, if you feel the
> > original fix is not stable yet, please feel free to delay moving it to
> > hotfix-stable for one week or two.
> 
> That's fine - we can merge cc:stable patches any time, really - they
> will still get backported.  There's only a hurry to get fixes merged up
> if they're security-related or if the issue is causing people problems.
> 
> In this case I'll await your -fix-2.patch and we can merge the patch
> next week.

All sounds good, thank you Andrew!  I just posted the -fix-2.patch:
https://lore.kernel.org/20240627163153.75969-1-sj@kernel.org


Thanks,
SJ
diff mbox series

Patch

diff --git a/mm/damon/core.c b/mm/damon/core.c
index e6598c44b53c..dac27b949403 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -1695,20 +1695,21 @@  static void damon_merge_regions_of(struct damon_target *t, unsigned int thres,
  * overhead under the dynamically changeable access pattern.  If a merge was
  * unnecessarily made, later 'kdamond_split_regions()' will revert it.
  *
- * The total number of regions could be temporarily higher than the
- * user-defined limit, max_nr_regions for some cases.  For an example, the user
- * updates max_nr_regions to a number that lower than the current number of
- * regions while DAMON is running.  Depending on the access pattern, it could
- * take indefinitve time to reduce the number below the limit.  For such a
- * case, repeat merging until the limit is met while increasing @threshold and
- * @sz_limit.
+ * The total number of regions could be higher than the user-defined limit,
+ * max_nr_regions for some cases.  For example, the user can update
+ * max_nr_regions to a number that lower than the current number of regions
+ * while DAMON is running.  For such a case, repeat merging until the limit is
+ * met while increasing @threshold up to possible maximum level.
  */
 static void kdamond_merge_regions(struct damon_ctx *c, unsigned int threshold,
 				  unsigned long sz_limit)
 {
 	struct damon_target *t;
 	unsigned int nr_regions;
+	unsigned int max_thres;
 
+	max_thres = c->attrs.aggr_interval /
+		(c->attrs.sample_interval ?  c->attrs.sample_interval : 1);
 	do {
 		nr_regions = 0;
 		damon_for_each_target(t, c) {
@@ -1716,8 +1717,8 @@  static void kdamond_merge_regions(struct damon_ctx *c, unsigned int threshold,
 			nr_regions += damon_nr_regions(t);
 		}
 		threshold = max(1, threshold * 2);
-		sz_limit = max(1, sz_limit * 2);
-	} while (nr_regions > c->attrs.max_nr_regions);
+	} while (nr_regions > c->attrs.max_nr_regions &&
+			threshold <= max_thres);
 }
 
 /*