[RFC,01/10] autonuma: Fix watermark checking in migrate_balanced_pgdat()

Message ID 20191101075727.26683-2-ying.huang@intel.com (mailing list archive)
State New, archived
Series autonuma: Optimize memory placement in memory tiering system

Commit Message

Huang, Ying Nov. 1, 2019, 7:57 a.m. UTC
From: Huang Ying <ying.huang@intel.com>

When zone_watermark_ok() is called in migrate_balanced_pgdat() to
check migration target node, the parameter classzone_idx (for
requested zone) is specified as 0 (ZONE_DMA).  But when allocating
memory for autonuma in alloc_misplaced_dst_page(), the requested zone
from GFP flags is ZONE_MOVABLE.  That is, the two paths use different
requested zones, and the size of lowmem_reserve depends on the
requested zone.  This mismatch may cause some issues.

For example, in the zoneinfo of a test machine as below,

Node 0, zone    DMA32
  pages free     61592
        min      29
        low      454
        high     879
        spanned  1044480
        present  442306
        managed  425921
        protection: (0, 0, 62457, 62457, 62457)

The number of free pages in ZONE_DMA32 is greater than "high watermark +
lowmem_reserve[ZONE_DMA]", but less than "high watermark +
lowmem_reserve[ZONE_MOVABLE]".  So although __alloc_pages_node() in
alloc_misplaced_dst_page() requests ZONE_MOVABLE, the
zone_watermark_ok() check on ZONE_DMA32 in migrate_balanced_pgdat(),
which uses ZONE_DMA, may always return true.  As a result, autonuma
may not stop even when memory pressure in node 0 is heavy.
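
The arithmetic behind this example can be sketched in a few lines of
userspace C.  This is only an illustrative model, not kernel code: the
struct zone_model type, the IDX_* names and watermark_ok_model() are
hypothetical, the check is reduced to the order-0 comparison "free >
mark + lowmem_reserve[classzone_idx]", and the zone indices assume an
x86_64 build where ZONE_MOVABLE is index 3 (the last three protection
entries above are equal, so the result does not depend on that
assumption).  nr_migrate_pages is taken as 1 purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed zone indices (x86_64 with ZONE_DEVICE); illustrative only. */
enum zone_idx_model {
	IDX_DMA, IDX_DMA32, IDX_NORMAL, IDX_MOVABLE, IDX_DEVICE, NR_IDX
};

/* Hypothetical, stripped-down stand-in for the fields used by the check. */
struct zone_model {
	long free;                   /* "pages free" from zoneinfo */
	long high;                   /* high watermark */
	long lowmem_reserve[NR_IDX]; /* "protection" from zoneinfo */
};

/* Values copied from the Node 0 DMA32 zoneinfo shown above. */
static struct zone_model example_dma32(void)
{
	struct zone_model z = {
		.free = 61592,
		.high = 879,
		.lowmem_reserve = { 0, 0, 62457, 62457, 62457 },
	};
	return z;
}

/*
 * Simplified order-0 watermark check: the zone passes only if its free
 * pages exceed the mark plus the reserve held back from allocations
 * that request classzone_idx.
 */
static bool watermark_ok_model(const struct zone_model *z, long mark,
			       int classzone_idx)
{
	assert(classzone_idx >= 0 && classzone_idx < NR_IDX);
	return z->free > mark + z->lowmem_reserve[classzone_idx];
}

/* Mirrors the mark used in migrate_balanced_pgdat(): high + nr_migrate. */
static bool balanced_model(const struct zone_model *z, long nr_migrate_pages,
			   int classzone_idx)
{
	return watermark_ok_model(z, z->high + nr_migrate_pages,
				  classzone_idx);
}
```

With these numbers, balanced_model(&z, 1, IDX_DMA) compares 61592
against 879 + 1 + 0 = 880 and passes, while
balanced_model(&z, 1, IDX_MOVABLE) compares 61592 against
879 + 1 + 62457 = 63337 and fails.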

To fix the issue, pass ZONE_MOVABLE as the classzone_idx parameter of
zone_watermark_ok() in migrate_balanced_pgdat().  This matches the
requested zone in alloc_misplaced_dst_page(), so that
migrate_balanced_pgdat() returns false when memory pressure is heavy.

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
---
 mm/migrate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Mel Gorman Nov. 1, 2019, 11:11 a.m. UTC | #1
On Fri, Nov 01, 2019 at 03:57:18PM +0800, Huang, Ying wrote:
> From: Huang Ying <ying.huang@intel.com>
> 
> When zone_watermark_ok() is called in migrate_balanced_pgdat() to
> check migration target node, the parameter classzone_idx (for
> requested zone) is specified as 0 (ZONE_DMA).  But when allocating
> memory for autonuma in alloc_misplaced_dst_page(), the requested zone
> from GFP flags is ZONE_MOVABLE.  That is, the requested zone is
> different.  The size of lowmem_reserve for the different requested
> zone is different.  And this may cause some issues.
> 
> For example, in the zoneinfo of a test machine as below,
> 
> Node 0, zone    DMA32
>   pages free     61592
>         min      29
>         low      454
>         high     879
>         spanned  1044480
>         present  442306
>         managed  425921
>         protection: (0, 0, 62457, 62457, 62457)
> 
> The free page number of ZONE_DMA32 is greater than "high watermark +
> lowmem_reserve[ZONE_DMA]", but less than "high watermark +
> lowmem_reserve[ZONE_MOVABLE]".  And because __alloc_pages_node() in
> alloc_misplaced_dst_page() requests ZONE_MOVABLE, the
> zone_watermark_ok() on ZONE_DMA32 in migrate_balanced_pgdat() may
> always return true.  So, autonuma may not stop even when memory
> pressure in node 0 is heavy.
> 
> To fix the issue, ZONE_MOVABLE is used as parameter to call
> zone_watermark_ok() in migrate_balanced_pgdat().  This makes it same
> as requested zone in alloc_misplaced_dst_page().  So that
> migrate_balanced_pgdat() returns false when memory pressure is heavy.
> 
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>

Acked-by: Mel Gorman <mgorman@suse.de>

This patch is independent of the series and should be resent separately.
Alternatively Andrew, please pick this patch up on its own.

Patch

diff --git a/mm/migrate.c b/mm/migrate.c
index 513107baccd3..8f06bd37d927 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1954,7 +1954,7 @@  static bool migrate_balanced_pgdat(struct pglist_data *pgdat,
 		if (!zone_watermark_ok(zone, 0,
 				       high_wmark_pages(zone) +
 				       nr_migrate_pages,
-				       0, 0))
+				       ZONE_MOVABLE, 0))
 			continue;
 		return true;
 	}