[rfc,1/3] mm: prepare more high-order pages to be stored on the per-cpu lists

Message ID: 20240415081220.3246839-2-wangkefeng.wang@huawei.com (mailing list archive)
State: New
Series: mm: allow more high-order pages stored on PCP lists

Commit Message

Kefeng Wang April 15, 2024, 8:12 a.m. UTC
Both file pages and anonymous pages now support large folios, so high-order
pages other than HPAGE_PMD_ORDER (PMD_SHIFT - PAGE_SHIFT) will be allocated
frequently, which increases zone lock contention. Allowing high-order pages
on the PCP lists can alleviate this contention. To let orders above
PAGE_ALLOC_COSTLY_ORDER and up to HPAGE_PMD_ORDER be stored on the per-cpu
lists, similar to PMD_ORDER pages, add more lists to struct per_cpu_pages
(one list per high order), and add a new PCP_MAX_ORDER to mmzone.h to
replace HPAGE_PMD_ORDER.

But as commit 44042b449872 ("mm/page_alloc: allow high-order pages to be
stored on the per-cpu lists") pointed out, this may not be a win in all
scenarios, so this patch does not yet allow the extra high-order pages to
be added to the PCP lists; the next patch adds a control to enable or
disable it.

The size of struct per_cpu_pages increases from 256 bytes (4 cache lines)
to 320 bytes (5 cache lines) on arm64 with defconfig.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 include/linux/mmzone.h |  4 +++-
 mm/page_alloc.c        | 10 +++++-----
 2 files changed, 8 insertions(+), 6 deletions(-)
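
For reference, here is a minimal userspace sketch (not part of the patch) showing
how the new macros lay out the per-cpu lists. The numeric constants are assumptions
matching an arm64 defconfig with 4K pages (PAGE_ALLOC_COSTLY_ORDER = 3,
MIGRATE_PCPTYPES = 3, PMD order = 9); the macro definitions simply mirror the ones
added by this patch.

/*
 * Standalone sketch of the PCP list layout after this patch.
 * The constants below are assumed values for arm64 defconfig with 4K pages.
 */
#include <stdio.h>

#define PAGE_ALLOC_COSTLY_ORDER  3
#define MIGRATE_PCPTYPES         3   /* unmovable, movable, reclaimable */
#define PCP_MAX_ORDER            9   /* PMD_SHIFT - PAGE_SHIFT on arm64/4K */
#define NR_PCP_THP               (PCP_MAX_ORDER - PAGE_ALLOC_COSTLY_ORDER)
#define NR_LOWORDER_PCP_LISTS    (MIGRATE_PCPTYPES * (PAGE_ALLOC_COSTLY_ORDER + 1))
#define HIGHORDER_PCP_LIST_INDEX (NR_LOWORDER_PCP_LISTS - (PAGE_ALLOC_COSTLY_ORDER + 1))
#define NR_PCP_LISTS             (NR_LOWORDER_PCP_LISTS + NR_PCP_THP)

int main(void)
{
        int order;

        /* 12 low-order lists, 6 high-order lists, 18 in total */
        printf("low-order lists: %d, high-order lists: %d, total: %d\n",
               NR_LOWORDER_PCP_LISTS, NR_PCP_THP, NR_PCP_LISTS);

        /* Each order above PAGE_ALLOC_COSTLY_ORDER gets exactly one list. */
        for (order = PAGE_ALLOC_COSTLY_ORDER + 1; order <= PCP_MAX_ORDER; order++)
                printf("order %d -> pcp list index %d\n",
                       order, order + HIGHORDER_PCP_LIST_INDEX);
        return 0;
}

With these assumed values, orders 4 through 9 map to list indexes 12 through 17:
one extra list per order above PAGE_ALLOC_COSTLY_ORDER, on top of the 12 existing
low-order lists (which keep one list per migratetype per order).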

Comments

Baolin Wang April 15, 2024, 11:41 a.m. UTC | #1
On 2024/4/15 16:12, Kefeng Wang wrote:
> Both file pages and anonymous pages now support large folios, so high-order
> pages other than HPAGE_PMD_ORDER (PMD_SHIFT - PAGE_SHIFT) will be allocated
> frequently, which increases zone lock contention. Allowing high-order pages
> on the PCP lists can alleviate this contention. To let orders above
> PAGE_ALLOC_COSTLY_ORDER and up to HPAGE_PMD_ORDER be stored on the per-cpu
> lists, similar to PMD_ORDER pages, add more lists to struct per_cpu_pages
> (one list per high order), and add a new PCP_MAX_ORDER to mmzone.h to
> replace HPAGE_PMD_ORDER.
> 
> But as commit 44042b449872 ("mm/page_alloc: allow high-order pages to be
> stored on the per-cpu lists") pointed out, this may not be a win in all
> scenarios, so this patch does not yet allow the extra high-order pages to
> be added to the PCP lists; the next patch adds a control to enable or
> disable it.
> 
> The size of struct per_cpu_pages increases from 256 bytes (4 cache lines)
> to 320 bytes (5 cache lines) on arm64 with defconfig.
> 
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
>   include/linux/mmzone.h |  4 +++-
>   mm/page_alloc.c        | 10 +++++-----
>   2 files changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index c11b7cde81ef..c745e2f1a0f2 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -657,11 +657,13 @@ enum zone_watermarks {
>    * failures.
>    */
>   #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> -#define NR_PCP_THP 1
> +#define PCP_MAX_ORDER (PMD_SHIFT - PAGE_SHIFT)
> +#define NR_PCP_THP (PCP_MAX_ORDER - PAGE_ALLOC_COSTLY_ORDER)
>   #else
>   #define NR_PCP_THP 0
>   #endif
>   #define NR_LOWORDER_PCP_LISTS (MIGRATE_PCPTYPES * (PAGE_ALLOC_COSTLY_ORDER + 1))
> +#define HIGHORDER_PCP_LIST_INDEX (NR_LOWORDER_PCP_LISTS - (PAGE_ALLOC_COSTLY_ORDER + 1))

Thanks for starting the discussion.

I am concerned that mixing mTHPs of different migratetypes in a single 
pcp list might lead to fragmentation issues, potentially causing 
unmovable mTHPs to occupy movable pageblocks, which would reduce 
compaction efficiency.

But I am also not sure whether it is suitable to add more pcp lists; maybe we
can just add the most commonly used mTHP size as a start, for example 64K?
Kefeng Wang April 15, 2024, 12:25 p.m. UTC | #2
On 2024/4/15 19:41, Baolin Wang wrote:
> 
> 
> On 2024/4/15 16:12, Kefeng Wang wrote:
>> Both file pages and anonymous pages now support large folios, so high-order
>> pages other than HPAGE_PMD_ORDER (PMD_SHIFT - PAGE_SHIFT) will be allocated
>> frequently, which increases zone lock contention. Allowing high-order pages
>> on the PCP lists can alleviate this contention. To let orders above
>> PAGE_ALLOC_COSTLY_ORDER and up to HPAGE_PMD_ORDER be stored on the per-cpu
>> lists, similar to PMD_ORDER pages, add more lists to struct per_cpu_pages
>> (one list per high order), and add a new PCP_MAX_ORDER to mmzone.h to
>> replace HPAGE_PMD_ORDER.
>>
>> But as commit 44042b449872 ("mm/page_alloc: allow high-order pages to be
>> stored on the per-cpu lists") pointed out, this may not be a win in all
>> scenarios, so this patch does not yet allow the extra high-order pages to
>> be added to the PCP lists; the next patch adds a control to enable or
>> disable it.
>>
>> The size of struct per_cpu_pages increases from 256 bytes (4 cache lines)
>> to 320 bytes (5 cache lines) on arm64 with defconfig.
>>
>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>> ---
>>   include/linux/mmzone.h |  4 +++-
>>   mm/page_alloc.c        | 10 +++++-----
>>   2 files changed, 8 insertions(+), 6 deletions(-)
>>
>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>> index c11b7cde81ef..c745e2f1a0f2 100644
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -657,11 +657,13 @@ enum zone_watermarks {
>>    * failures.
>>    */
>>   #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>> -#define NR_PCP_THP 1
>> +#define PCP_MAX_ORDER (PMD_SHIFT - PAGE_SHIFT)
>> +#define NR_PCP_THP (PCP_MAX_ORDER - PAGE_ALLOC_COSTLY_ORDER)
>>   #else
>>   #define NR_PCP_THP 0
>>   #endif
>>   #define NR_LOWORDER_PCP_LISTS (MIGRATE_PCPTYPES * 
>> (PAGE_ALLOC_COSTLY_ORDER + 1))
>> +#define HIGHORDER_PCP_LIST_INDEX (NR_LOWORDER_PCP_LISTS - 
>> (PAGE_ALLOC_COSTLY_ORDER + 1))
> 
> Thanks for starting the discussion.
> 
> I am concerned that mixing mTHPs of different migratetypes in a single 
> pcp list might lead to fragmentation issues, potentially causing 
> unmovable mTHPs to occupy movable pageblocks, which would reduce 
> compaction efficiency.
> 

Yes, this is not enabled by default.

> But I am also not sure whether it is suitable to add more pcp lists; maybe we
> can just add the most commonly used mTHP size as a start, for example 64K?

Do you mean to add only one list for 64K? I thought about that before, but it
does not hold for all cases; a different order may be the most used in other
tests, so the plan is to enable only the specified high orders via a
pcp_enabled sysfs interface. But it is certain that we need to find a case
that shows an improvement when using a high order (e.g. order 4 = 64K) on the
PCP lists.

Patch

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index c11b7cde81ef..c745e2f1a0f2 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -657,11 +657,13 @@  enum zone_watermarks {
  * failures.
  */
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-#define NR_PCP_THP 1
+#define PCP_MAX_ORDER (PMD_SHIFT - PAGE_SHIFT)
+#define NR_PCP_THP (PCP_MAX_ORDER - PAGE_ALLOC_COSTLY_ORDER)
 #else
 #define NR_PCP_THP 0
 #endif
 #define NR_LOWORDER_PCP_LISTS (MIGRATE_PCPTYPES * (PAGE_ALLOC_COSTLY_ORDER + 1))
+#define HIGHORDER_PCP_LIST_INDEX (NR_LOWORDER_PCP_LISTS - (PAGE_ALLOC_COSTLY_ORDER + 1))
 #define NR_PCP_LISTS (NR_LOWORDER_PCP_LISTS + NR_PCP_THP)
 
 #define min_wmark_pages(z) (z->_watermark[WMARK_MIN] + z->watermark_boost)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b51becf03d1e..2248afc7b73a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -506,8 +506,8 @@  static inline unsigned int order_to_pindex(int migratetype, int order)
 {
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 	if (order > PAGE_ALLOC_COSTLY_ORDER) {
-		VM_BUG_ON(order != HPAGE_PMD_ORDER);
-		return NR_LOWORDER_PCP_LISTS;
+		VM_BUG_ON(order > PCP_MAX_ORDER);
+		return order + HIGHORDER_PCP_LIST_INDEX;
 	}
 #else
 	VM_BUG_ON(order > PAGE_ALLOC_COSTLY_ORDER);
@@ -521,8 +521,8 @@  static inline int pindex_to_order(unsigned int pindex)
 	int order = pindex / MIGRATE_PCPTYPES;
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-	if (pindex == NR_LOWORDER_PCP_LISTS)
-		order = HPAGE_PMD_ORDER;
+	if (pindex >= NR_LOWORDER_PCP_LISTS)
+		order = pindex - HIGHORDER_PCP_LIST_INDEX;
 #else
 	VM_BUG_ON(order > PAGE_ALLOC_COSTLY_ORDER);
 #endif
@@ -535,7 +535,7 @@  static inline bool pcp_allowed_order(unsigned int order)
 	if (order <= PAGE_ALLOC_COSTLY_ORDER)
 		return true;
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-	if (order == HPAGE_PMD_ORDER)
+	if (order == PCP_MAX_ORDER)
 		return true;
 #endif
 	return false;