From patchwork Wed Sep 21 06:06:11 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Huang, Ying" X-Patchwork-Id: 12983238 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C8614ECAAD8 for ; Wed, 21 Sep 2022 06:06:56 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id BEAE26B0073; Wed, 21 Sep 2022 02:06:55 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id AAE8B6B0074; Wed, 21 Sep 2022 02:06:55 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 976E3940007; Wed, 21 Sep 2022 02:06:55 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 8995E6B0073 for ; Wed, 21 Sep 2022 02:06:55 -0400 (EDT) Received: from smtpin24.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 449C4140F18 for ; Wed, 21 Sep 2022 06:06:55 +0000 (UTC) X-FDA: 79935059190.24.20BCED8 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by imf02.hostedemail.com (Postfix) with ESMTP id AD1AF8000E for ; Wed, 21 Sep 2022 06:06:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1663740414; x=1695276414; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=P89pEx/QWVFzErE/u067pCGTE2O1BWblDLS8JJgMTCo=; b=NPJc4zWXq4Rke5uHzHcvH4SzrfzfqSxOfxo90ark/n/jfd6xej8QmD7W 9ooj7PYwJHcxQOwu6HTKh+0duLC2RwO1rgX/Wt/SVtephPcKnn+SFR68S iMPg2miGiW3QLeNpqNvDIhaH/t0qaTrJ04QfiLvizt9Zw80GcoNISO3N2 8Kx8rUZmVcdXBc5n3r9Ndc8pXAEuNxvhVRu0q0HKibkxNjDTV4EXDRz2q k3tg7YgLV0UJMX8CrMnGrmDy1fcz9PALMUY5kxQMN0Zkti9K38lvZ6EkI qWGIQ2xsMtMHToip/ruYlvwZ1F3NsgSxNYU5cjP3wMXBIxxPJqHJjjrBI w==; X-IronPort-AV: E=McAfee;i="6500,9779,10476"; a="282956817" X-IronPort-AV: E=Sophos;i="5.93,332,1654585200"; d="scan'208";a="282956817" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Sep 2022 23:06:53 -0700 X-IronPort-AV: E=Sophos;i="5.93,332,1654585200"; d="scan'208";a="649913842" Received: from yhuang6-mobl2.sh.intel.com ([10.238.5.245]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Sep 2022 23:06:50 -0700 From: Huang Ying To: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org, Andrew Morton , Huang Ying , Zi Yan , Yang Shi , Baolin Wang , Oscar Salvador , Matthew Wilcox Subject: [RFC 1/6] mm/migrate_pages: separate huge page and normal pages migration Date: Wed, 21 Sep 2022 14:06:11 +0800 Message-Id: <20220921060616.73086-2-ying.huang@intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220921060616.73086-1-ying.huang@intel.com> References: <20220921060616.73086-1-ying.huang@intel.com> MIME-Version: 1.0 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1663740414; a=rsa-sha256; cv=none; b=g4sNiuw5wuULQrjIP1gIoWoejf2LYMcsHoIaouF6FcsuqYS1I1fKeO8NZ+xUI3Ty35gI1W yAxJPcxPB+QwVaFJ+yThdulkrnQGcI3HW0YOlp/O3+dEp+s2qiUHu/FzVHxLEmK7Lz8FoV CL6rOry7+FimovEE14JcuhmHyKPj/TQ= ARC-Authentication-Results: i=1; imf02.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=NPJc4zWX; spf=pass (imf02.hostedemail.com: 
domain of ying.huang@intel.com designates 134.134.136.126 as permitted sender) smtp.mailfrom=ying.huang@intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1663740414; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=iEWjYvdzXc766o9gXUnYzm/lZBr/zsno7aHFZjyF2lk=; b=R05rPvL2TEFS5n11up952uhSI6wajFTnppMkMOJS+NJD5tUD9QZG/f3bzkiU59A3t7Afu6 dAav4pfZXso3TCpOE2EttuLBJ5vijGLqvbi+VmFQOvMb+wKwpCfpzYQvHsYznU2/McsEJo kDurQ9Hr/zlLHPgI8FVOr4GemjgUjk0= X-Rspam-User: X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: AD1AF8000E Authentication-Results: imf02.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=NPJc4zWX; spf=pass (imf02.hostedemail.com: domain of ying.huang@intel.com designates 134.134.136.126 as permitted sender) smtp.mailfrom=ying.huang@intel.com; dmarc=pass (policy=none) header.from=intel.com X-Stat-Signature: u4eaq3esucgp4ysbhz8g5nr1w4w374u7 X-HE-Tag: 1663740414-619721 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This is a preparation patch to batch the page unmapping and moving for the normal pages and THPs. Based on that we can batch the TLB shootdown during the page migration and make it possible to use some hardware accelerator for the page copying. In this patch the huge page (PageHuge()) and normal page and THP migration is separated in migrate_pages() to make it easy to change the normal page and THP migration implementation. Signed-off-by: "Huang, Ying" Cc: Zi Yan Cc: Yang Shi Cc: Baolin Wang Cc: Oscar Salvador Cc: Matthew Wilcox --- mm/migrate.c | 73 +++++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 64 insertions(+), 9 deletions(-) diff --git a/mm/migrate.c b/mm/migrate.c index 571d8c9fd5bc..117134f1c6dc 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -1414,6 +1414,66 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page, trace_mm_migrate_pages_start(mode, reason); + for (pass = 0; pass < 10 && retry; pass++) { + retry = 0; + + list_for_each_entry_safe(page, page2, from, lru) { + nr_subpages = compound_nr(page); + cond_resched(); + + if (!PageHuge(page)) + continue; + + rc = unmap_and_move_huge_page(get_new_page, + put_new_page, private, page, + pass > 2, mode, reason, + &ret_pages); + /* + * The rules are: + * Success: hugetlb page will be put back + * -EAGAIN: stay on the from list + * -ENOMEM: stay on the from list + * -ENOSYS: stay on the from list + * Other errno: put on ret_pages list then splice to + * from list + */ + switch(rc) { + case -ENOSYS: + /* Hugetlb migration is unsupported */ + nr_failed++; + nr_failed_pages += nr_subpages; + list_move_tail(&page->lru, &ret_pages); + break; + case -ENOMEM: + /* + * When memory is low, don't bother to try to migrate + * other pages, just exit. + */ + nr_failed++; + nr_failed_pages += nr_subpages + nr_retry_pages; + goto out; + case -EAGAIN: + retry++; + nr_retry_pages += nr_subpages; + break; + case MIGRATEPAGE_SUCCESS: + nr_succeeded += nr_subpages; + break; + default: + /* + * Permanent failure (-EBUSY, etc.): + * unlike -EAGAIN case, the failed page is + * removed from migration page list and not + * retried in the next outer loop. 
+ */ + nr_failed++; + nr_failed_pages += nr_subpages; + break; + } + } + } + nr_failed += retry; + retry = 1; thp_subpage_migration: for (pass = 0; pass < 10 && (retry || thp_retry); pass++) { retry = 0; @@ -1431,18 +1491,14 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page, cond_resched(); if (PageHuge(page)) - rc = unmap_and_move_huge_page(get_new_page, - put_new_page, private, page, - pass > 2, mode, reason, - &ret_pages); - else - rc = unmap_and_move(get_new_page, put_new_page, + continue; + + rc = unmap_and_move(get_new_page, put_new_page, private, page, pass > 2, mode, reason, &ret_pages); /* * The rules are: - * Success: non hugetlb page will be freed, hugetlb - * page will be put back + * Success: page will be freed * -EAGAIN: stay on the from list * -ENOMEM: stay on the from list * -ENOSYS: stay on the from list @@ -1468,7 +1524,6 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page, nr_thp_split++; break; } - /* Hugetlb migration is unsupported */ } else if (!no_subpage_counting) { nr_failed++; } From patchwork Wed Sep 21 06:06:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Huang, Ying" X-Patchwork-Id: 12983239 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id E1F07C6FA82 for ; Wed, 21 Sep 2022 06:06:58 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 386B36B0074; Wed, 21 Sep 2022 02:06:58 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 336416B0075; Wed, 21 Sep 2022 02:06:58 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1DABE940007; Wed, 21 Sep 2022 02:06:58 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 09DEF6B0074 for ; Wed, 21 Sep 2022 02:06:58 -0400 (EDT) Received: from smtpin29.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id D8F40C0E94 for ; Wed, 21 Sep 2022 06:06:57 +0000 (UTC) X-FDA: 79935059274.29.D922FAB Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by imf28.hostedemail.com (Postfix) with ESMTP id 53CFCC000B for ; Wed, 21 Sep 2022 06:06:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1663740417; x=1695276417; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=MrLzv8Dvhk4JFF3omMcaNuJrUuoPDcPH4H019ndQy9c=; b=TNkKvhtHDZ0dCvYDnd42nk4/PEPywiFh3yQvkLdAHqqLj7VcJJnKoX28 2td7va5qfMuQt5KYG7HEnf1nq2SqDg9rL/HHTwsxEPCSiqBjc/qvYe+Wq bMi+2NRG5P3OKWGzbLLoFbH2k+A8YvBsMzZoSJsAQmaUHEEuGhnrwhd4K r/vlBbR7Jv31wbeOqEXcBrDRGsWpFtR74obmLYwZsIR+1S8Rrd66GI2Lh SgBmUr15tA5u40D5R1Uol0i2HIxfa7yMzzWGUXrmRnFBdw1kTI3hmF2Jf R9XxT+tT26Q/PbsEBmP2f8XkttTZWDeQAxcQ/Lj5N0JLJ/HVsfRPwB/T9 g==; X-IronPort-AV: E=McAfee;i="6500,9779,10476"; a="282956828" X-IronPort-AV: E=Sophos;i="5.93,332,1654585200"; d="scan'208";a="282956828" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Sep 2022 23:06:56 -0700 X-IronPort-AV: E=Sophos;i="5.93,332,1654585200"; d="scan'208";a="649913857" Received: from yhuang6-mobl2.sh.intel.com ([10.238.5.245]) by 
orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Sep 2022 23:06:53 -0700
From: Huang Ying
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Andrew Morton, Huang Ying, Zi Yan, Yang Shi, Baolin Wang, Oscar Salvador, Matthew Wilcox
Subject: [RFC 2/6] mm/migrate_pages: split unmap_and_move() to _unmap() and _move()
Date: Wed, 21 Sep 2022 14:06:12 +0800
Message-Id: <20220921060616.73086-3-ying.huang@intel.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220921060616.73086-1-ying.huang@intel.com>
References: <20220921060616.73086-1-ying.huang@intel.com>
MIME-Version: 1.0

This is a preparation patch to batch the page unmapping and moving for normal pages and THPs. In this patch, unmap_and_move() is split into migrate_page_unmap() and migrate_page_move(), so that _unmap() and _move() can be batched in separate loops later. To pass the necessary state between the unmap and move stages, the otherwise unused newpage->mapping and newpage->private fields are used.
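For orientation, the resulting two-stage calling convention at the existing call site looks roughly like this (condensed from the migrate_pages() hunk at the end of this patch; retry handling and statistics omitted):

	struct page *newpage = NULL;
	int rc;

	/* Stage 1: lock the page, allocate newpage, and replace the PTEs
	 * with migration entries; anon_vma and page_was_mapped are stashed
	 * in newpage->mapping and newpage->private on success. */
	rc = migrate_page_unmap(get_new_page, put_new_page, private, page,
				&newpage, pass > 2, mode, reason, &ret_pages);
	if (rc == MIGRATEPAGE_UNMAP)
		/* Stage 2: copy to newpage and point the migration PTEs at it. */
		rc = migrate_page_move(put_new_page, private, page, newpage,
				       mode, reason, &ret_pages);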
Signed-off-by: "Huang, Ying" Cc: Zi Yan Cc: Yang Shi Cc: Baolin Wang Cc: Oscar Salvador Cc: Matthew Wilcox Reviewed-by: Baolin Wang --- mm/migrate.c | 164 ++++++++++++++++++++++++++++++++++++++------------- 1 file changed, 122 insertions(+), 42 deletions(-) diff --git a/mm/migrate.c b/mm/migrate.c index 117134f1c6dc..4a81e0bfdbcd 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -976,13 +976,32 @@ static int move_to_new_folio(struct folio *dst, struct folio *src, return rc; } -static int __unmap_and_move(struct page *page, struct page *newpage, +static void __migrate_page_record(struct page *newpage, + int page_was_mapped, + struct anon_vma *anon_vma) +{ + newpage->mapping = (struct address_space *)anon_vma; + newpage->private = page_was_mapped; +} + +static void __migrate_page_extract(struct page *newpage, + int *page_was_mappedp, + struct anon_vma **anon_vmap) +{ + *anon_vmap = (struct anon_vma *)newpage->mapping; + *page_was_mappedp = newpage->private; + newpage->mapping = NULL; + newpage->private = 0; +} + +#define MIGRATEPAGE_UNMAP 1 + +static int __migrate_page_unmap(struct page *page, struct page *newpage, int force, enum migrate_mode mode) { struct folio *folio = page_folio(page); - struct folio *dst = page_folio(newpage); int rc = -EAGAIN; - bool page_was_mapped = false; + int page_was_mapped = 0; struct anon_vma *anon_vma = NULL; bool is_lru = !__PageMovable(page); @@ -1058,8 +1077,8 @@ static int __unmap_and_move(struct page *page, struct page *newpage, goto out_unlock; if (unlikely(!is_lru)) { - rc = move_to_new_folio(dst, folio, mode); - goto out_unlock_both; + __migrate_page_record(newpage, page_was_mapped, anon_vma); + return MIGRATEPAGE_UNMAP; } /* @@ -1085,11 +1104,41 @@ static int __unmap_and_move(struct page *page, struct page *newpage, VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && !anon_vma, page); try_to_migrate(folio, 0); - page_was_mapped = true; + page_was_mapped = 1; + } + + if (!page_mapped(page)) { + __migrate_page_record(newpage, page_was_mapped, anon_vma); + return MIGRATEPAGE_UNMAP; } - if (!page_mapped(page)) - rc = move_to_new_folio(dst, folio, mode); + if (page_was_mapped) + remove_migration_ptes(folio, folio, false); + +out_unlock_both: + unlock_page(newpage); +out_unlock: + /* Drop an anon_vma reference if we took one */ + if (anon_vma) + put_anon_vma(anon_vma); + unlock_page(page); +out: + + return rc; +} + +static int __migrate_page_move(struct page *page, struct page *newpage, + enum migrate_mode mode) +{ + struct folio *folio = page_folio(page); + struct folio *dst = page_folio(newpage); + int rc; + int page_was_mapped = 0; + struct anon_vma *anon_vma = NULL; + + __migrate_page_extract(newpage, &page_was_mapped, &anon_vma); + + rc = move_to_new_folio(dst, folio, mode); /* * When successful, push newpage to LRU immediately: so that if it @@ -1110,14 +1159,11 @@ static int __unmap_and_move(struct page *page, struct page *newpage, remove_migration_ptes(folio, rc == MIGRATEPAGE_SUCCESS ? dst : folio, false); -out_unlock_both: unlock_page(newpage); -out_unlock: /* Drop an anon_vma reference if we took one */ if (anon_vma) put_anon_vma(anon_vma); unlock_page(page); -out: /* * If migration is successful, decrease refcount of the newpage, * which will not free the page because new page owner increased @@ -1129,18 +1175,31 @@ static int __unmap_and_move(struct page *page, struct page *newpage, return rc; } -/* - * Obtain the lock on page, remove all ptes and migrate the page - * to the newly allocated page in newpage. 
- */ -static int unmap_and_move(new_page_t get_new_page, - free_page_t put_new_page, - unsigned long private, struct page *page, - int force, enum migrate_mode mode, - enum migrate_reason reason, - struct list_head *ret) +static void migrate_page_done(struct page *page, + enum migrate_reason reason) +{ + /* + * Compaction can migrate also non-LRU pages which are + * not accounted to NR_ISOLATED_*. They can be recognized + * as __PageMovable + */ + if (likely(!__PageMovable(page))) + mod_node_page_state(page_pgdat(page), NR_ISOLATED_ANON + + page_is_file_lru(page), -thp_nr_pages(page)); + + if (reason != MR_MEMORY_FAILURE) + /* We release the page in page_handle_poison. */ + put_page(page); +} + +/* Obtain the lock on page, remove all ptes. */ +static int migrate_page_unmap(new_page_t get_new_page, free_page_t put_new_page, + unsigned long private, struct page *page, + struct page **newpagep, int force, + enum migrate_mode mode, enum migrate_reason reason, + struct list_head *ret) { - int rc = MIGRATEPAGE_SUCCESS; + int rc = MIGRATEPAGE_UNMAP; struct page *newpage = NULL; if (!thp_migration_supported() && PageTransHuge(page)) @@ -1151,19 +1210,48 @@ static int unmap_and_move(new_page_t get_new_page, ClearPageActive(page); ClearPageUnevictable(page); /* free_pages_prepare() will clear PG_isolated. */ - goto out; + list_del(&page->lru); + migrate_page_done(page, reason); + return MIGRATEPAGE_SUCCESS; } newpage = get_new_page(page, private); if (!newpage) return -ENOMEM; + *newpagep = newpage; - newpage->private = 0; - rc = __unmap_and_move(page, newpage, force, mode); + rc = __migrate_page_unmap(page, newpage, force, mode); + if (rc == MIGRATEPAGE_UNMAP) + return rc; + + /* + * A page that has not been migrated will have kept its + * references and be restored. + */ + /* restore the page to right list. */ + if (rc != -EAGAIN) + list_move_tail(&page->lru, ret); + + if (put_new_page) + put_new_page(newpage, private); + else + put_page(newpage); + + return rc; +} + +/* Migrate the page to the newly allocated page in newpage. */ +static int migrate_page_move(free_page_t put_new_page, unsigned long private, + struct page *page, struct page *newpage, + enum migrate_mode mode, enum migrate_reason reason, + struct list_head *ret) +{ + int rc; + + rc = __migrate_page_move(page, newpage, mode); if (rc == MIGRATEPAGE_SUCCESS) set_page_owner_migrate_reason(newpage, reason); -out: if (rc != -EAGAIN) { /* * A page that has been migrated has all references @@ -1179,20 +1267,7 @@ static int unmap_and_move(new_page_t get_new_page, * we want to retry. */ if (rc == MIGRATEPAGE_SUCCESS) { - /* - * Compaction can migrate also non-LRU pages which are - * not accounted to NR_ISOLATED_*. They can be recognized - * as __PageMovable - */ - if (likely(!__PageMovable(page))) - mod_node_page_state(page_pgdat(page), NR_ISOLATED_ANON + - page_is_file_lru(page), -thp_nr_pages(page)); - - if (reason != MR_MEMORY_FAILURE) - /* - * We release the page in page_handle_poison. 
- */ - put_page(page); + migrate_page_done(page, reason); } else { if (rc != -EAGAIN) list_add_tail(&page->lru, ret); @@ -1405,6 +1480,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page, int pass = 0; bool is_thp = false; struct page *page; + struct page *newpage = NULL; struct page *page2; int rc, nr_subpages; LIST_HEAD(ret_pages); @@ -1493,9 +1569,13 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page, if (PageHuge(page)) continue; - rc = unmap_and_move(get_new_page, put_new_page, - private, page, pass > 2, mode, + rc = migrate_page_unmap(get_new_page, put_new_page, private, + page, &newpage, pass > 2, mode, reason, &ret_pages); + if (rc == MIGRATEPAGE_UNMAP) + rc = migrate_page_move(put_new_page, private, + page, newpage, mode, + reason, &ret_pages); /* * The rules are: * Success: page will be freed From patchwork Wed Sep 21 06:06:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Huang, Ying" X-Patchwork-Id: 12983240 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id F4052ECAAD8 for ; Wed, 21 Sep 2022 06:07:00 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 7CB9A6B0075; Wed, 21 Sep 2022 02:07:00 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 754B7940008; Wed, 21 Sep 2022 02:07:00 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5CF2E940007; Wed, 21 Sep 2022 02:07:00 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 4D9376B0075 for ; Wed, 21 Sep 2022 02:07:00 -0400 (EDT) Received: from smtpin21.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 2A2E9C06FE for ; Wed, 21 Sep 2022 06:07:00 +0000 (UTC) X-FDA: 79935059400.21.471875F Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by imf28.hostedemail.com (Postfix) with ESMTP id 879E6C000C for ; Wed, 21 Sep 2022 06:06:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1663740419; x=1695276419; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=vMOxlk+IH0y/jJD/DO08c91l+llx8vnqZ8FEqWsnej4=; b=m+YMZEJA4Et195RUlu9WpNWq7bHU9pVayHONR0AwDaZdFFvFVmRb/IJI DTn4Pm2PW5U4sIJK8TnRKpppEY0+j6DTtqLREUjUmhOdrsMVEyiPBTXGB 22QEzVdM3KuErVO37s/kayGCWEMUZgCQ9gutb6EOSWCmcKsAEgVqqE32E tpycN1djDqJsqsjVWhgTvJaZ3oHvhhJPmwlo50Xyu9E5Po4CzY6uvDLVm VfgCpFZQCVV8nEoXkNkPs0qcKA+UZcrgs4eMljX27i0BQfnlUtik0xAgQ BJOSbjNHyMWtUiag8utm8TAZJY9a5rPxUP8A7bL+WK0H/pWE3FxhYlo05 Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10476"; a="282956841" X-IronPort-AV: E=Sophos;i="5.93,332,1654585200"; d="scan'208";a="282956841" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Sep 2022 23:06:59 -0700 X-IronPort-AV: E=Sophos;i="5.93,332,1654585200"; d="scan'208";a="649913874" Received: from yhuang6-mobl2.sh.intel.com ([10.238.5.245]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Sep 2022 23:06:56 -0700 From: Huang Ying To: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org, Andrew Morton , Huang Ying , Zi 
Yan, Yang Shi, Baolin Wang, Oscar Salvador, Matthew Wilcox
Subject: [RFC 3/6] mm/migrate_pages: restrict number of pages to migrate in batch
Date: Wed, 21 Sep 2022 14:06:13 +0800
Message-Id: <20220921060616.73086-4-ying.huang@intel.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220921060616.73086-1-ying.huang@intel.com>
References: <20220921060616.73086-1-ying.huang@intel.com>
MIME-Version: 1.0

This is a preparation patch to batch the page unmapping and moving for normal pages and THPs. Once the page unmapping is batched, all pages to be migrated are unmapped before their contents and flags are copied. If the number of pages passed to migrate_pages() is too large, too many pages are unmapped at once, and the execution of the processes that own them is stopped for too long. For example, the migrate_pages() syscall calls migrate_pages() with all pages of a process. To avoid this possible issue, this patch restricts the number of pages migrated in one batch to at most HPAGE_PMD_NR. That is, the impact is at the same level as that of THP migration.
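Condensed from the new migrate_pages() wrapper in the diff below, one batch is carved off the front of the list like this (retry and cleanup logic omitted):

	nr_pages = 0;
	list_for_each_entry(page, from, lru) {
		nr_pages += compound_nr(page);
		if (nr_pages > HPAGE_PMD_NR)
			break;
	}
	if (nr_pages > HPAGE_PMD_NR)
		/* migrate only the pages ahead of 'page' in this round */
		list_cut_before(&pagelist, from, &page->lru);
	else
		list_splice_init(from, &pagelist);
	rc = migrate_pages_batch(&pagelist, get_new_page, put_new_page,
				 private, mode, reason, &ret_succeeded);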
Signed-off-by: "Huang, Ying" Cc: Zi Yan Cc: Yang Shi Cc: Baolin Wang Cc: Oscar Salvador Cc: Matthew Wilcox --- mm/migrate.c | 93 +++++++++++++++++++++++++++++++++++++--------------- 1 file changed, 67 insertions(+), 26 deletions(-) diff --git a/mm/migrate.c b/mm/migrate.c index 4a81e0bfdbcd..1077af858e36 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -1439,32 +1439,7 @@ static inline int try_split_thp(struct page *page, struct list_head *split_pages return rc; } -/* - * migrate_pages - migrate the pages specified in a list, to the free pages - * supplied as the target for the page migration - * - * @from: The list of pages to be migrated. - * @get_new_page: The function used to allocate free pages to be used - * as the target of the page migration. - * @put_new_page: The function used to free target pages if migration - * fails, or NULL if no special handling is necessary. - * @private: Private data to be passed on to get_new_page() - * @mode: The migration mode that specifies the constraints for - * page migration, if any. - * @reason: The reason for page migration. - * @ret_succeeded: Set to the number of normal pages migrated successfully if - * the caller passes a non-NULL pointer. - * - * The function returns after 10 attempts or if no pages are movable any more - * because the list has become empty or no retryable pages exist any more. - * It is caller's responsibility to call putback_movable_pages() to return pages - * to the LRU or free list only if ret != 0. - * - * Returns the number of {normal page, THP, hugetlb} that were not migrated, or - * an error code. The number of THP splits will be considered as the number of - * non-migrated THP, no matter how many subpages of the THP are migrated successfully. - */ -int migrate_pages(struct list_head *from, new_page_t get_new_page, +static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page, free_page_t put_new_page, unsigned long private, enum migrate_mode mode, int reason, unsigned int *ret_succeeded) { @@ -1709,6 +1684,72 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page, return rc; } +/* + * migrate_pages - migrate the pages specified in a list, to the free pages + * supplied as the target for the page migration + * + * @from: The list of pages to be migrated. + * @get_new_page: The function used to allocate free pages to be used + * as the target of the page migration. + * @put_new_page: The function used to free target pages if migration + * fails, or NULL if no special handling is necessary. + * @private: Private data to be passed on to get_new_page() + * @mode: The migration mode that specifies the constraints for + * page migration, if any. + * @reason: The reason for page migration. + * @ret_succeeded: Set to the number of normal pages migrated successfully if + * the caller passes a non-NULL pointer. + * + * The function returns after 10 attempts or if no pages are movable any more + * because the list has become empty or no retryable pages exist any more. + * It is caller's responsibility to call putback_movable_pages() to return pages + * to the LRU or free list only if ret != 0. + * + * Returns the number of {normal page, THP, hugetlb} that were not migrated, or + * an error code. The number of THP splits will be considered as the number of + * non-migrated THP, no matter how many subpages of the THP are migrated successfully. 
+ */ +int migrate_pages(struct list_head *from, new_page_t get_new_page, + free_page_t put_new_page, unsigned long private, + enum migrate_mode mode, int reason, unsigned int *pret_succeeded) +{ + int rc, rc_gether = 0; + int ret_succeeded, ret_succeeded_gether = 0; + int nr_pages; + struct page *page; + LIST_HEAD(pagelist); + LIST_HEAD(ret_pages); + +again: + nr_pages = 0; + list_for_each_entry(page, from, lru) { + nr_pages += compound_nr(page); + if (nr_pages > HPAGE_PMD_NR) + break; + } + if (nr_pages > HPAGE_PMD_NR) + list_cut_before(&pagelist, from, &page->lru); + else + list_splice_init(from, &pagelist); + rc = migrate_pages_batch(&pagelist, get_new_page, put_new_page, private, + mode, reason, &ret_succeeded); + ret_succeeded_gether += ret_succeeded; + list_splice_tail_init(&pagelist, &ret_pages); + if (rc == -ENOMEM) { + rc_gether = rc; + goto out; + } + rc_gether += rc; + if (!list_empty(from)) + goto again; +out: + if (pret_succeeded) + *pret_succeeded = ret_succeeded_gether; + list_splice(&ret_pages, from); + + return rc_gether; +} + struct page *alloc_migration_target(struct page *page, unsigned long private) { struct folio *folio = page_folio(page); From patchwork Wed Sep 21 06:06:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Huang, Ying" X-Patchwork-Id: 12983241 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id CA5D5C6FA82 for ; Wed, 21 Sep 2022 06:07:03 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 61D26940008; Wed, 21 Sep 2022 02:07:03 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 5CCAD940007; Wed, 21 Sep 2022 02:07:03 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 447AC940008; Wed, 21 Sep 2022 02:07:03 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 36FF4940007 for ; Wed, 21 Sep 2022 02:07:03 -0400 (EDT) Received: from smtpin24.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 0BCD3120E41 for ; Wed, 21 Sep 2022 06:07:03 +0000 (UTC) X-FDA: 79935059526.24.EFB50F5 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by imf28.hostedemail.com (Postfix) with ESMTP id 0E8C9C000B for ; Wed, 21 Sep 2022 06:07:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1663740422; x=1695276422; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=86Q547FC+4GoVik4Kb6Pl8hTy/8I3tx19/Q4rOJo4M4=; b=VYsHAbO7/NGDkd9ozPRKuoYpNxyP1IsgWf+nb62rpFB7IT0an4gIrI14 Xr/2Yp9DbrLQOlujy/8+C6MmZ7gJyzqVQjm3/Wiols9mNBEVbiMfVxMax tHZHR1FdHgNhL+/rSr10nto6qSa6+hc2k/oGGoAthu+pWYM8zPXpCGCe0 6OpS+TtvWapqBPBlWrlSv68vkBmZEQZUXylJP8Vj7S/ZeBtN7JopXuFCN qEh1hu/enW2tGOTE5GzY6QTBR4KhT4hcBtAKWdFOU4WywAlYkGs7E0loi +1qS6hCPRQa6wMdp++PkgQPTFD3DOXX6hGxzk92/BZ4IOY1Uvmk3VES0e A==; X-IronPort-AV: E=McAfee;i="6500,9779,10476"; a="282956856" X-IronPort-AV: E=Sophos;i="5.93,332,1654585200"; d="scan'208";a="282956856" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Sep 2022 23:07:01 -0700 
From: Huang Ying
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Andrew Morton, Huang Ying, Zi Yan, Yang Shi, Baolin Wang, Oscar Salvador, Matthew Wilcox
Subject: [RFC 4/6] mm/migrate_pages: batch _unmap and _move
Date: Wed, 21 Sep 2022 14:06:14 +0800
Message-Id: <20220921060616.73086-5-ying.huang@intel.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220921060616.73086-1-ying.huang@intel.com>
References: <20220921060616.73086-1-ying.huang@intel.com>
MIME-Version: 1.0

In this patch the _unmap and _move stages of the page migration are batched. That is, previously the flow was:

  for each page
    _unmap()
    _move()

Now it is:

  for each page
    _unmap()
  for each page
    _move()

Based on this, we can batch the TLB flushing and use a hardware accelerator to copy pages between the batched _unmap and the batched _move stages.
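Condensed from the migrate_pages_batch() changes in the diff below (retry passes, statistics, and error handling omitted), the batched control flow is roughly:

	/* Pass 1: unmap every page; successfully unmapped pages go to
	 * unmap_pages and their allocated target pages to new_pages. */
	list_for_each_entry_safe(page, page2, from, lru) {
		rc = migrate_page_unmap(get_new_page, put_new_page, private,
					page, &newpage, pass > 2, mode,
					reason, &ret_pages);
		if (rc == MIGRATEPAGE_UNMAP) {
			list_move_tail(&page->lru, &unmap_pages);
			list_add_tail(&newpage->lru, &new_pages);
		}
	}

	/* A batched TLB flush or a hardware copy engine can be slotted in
	 * here, between the two passes. */

	/* Pass 2: move each unmapped page to its target page. */
	newpage = list_first_entry(&new_pages, struct page, lru);
	newpage2 = list_next_entry(newpage, lru);
	list_for_each_entry_safe(page, page2, &unmap_pages, lru) {
		rc = migrate_page_move(put_new_page, private, page, newpage,
				       mode, reason, &ret_pages);
		newpage = newpage2;
		newpage2 = list_next_entry(newpage, lru);
	}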
Signed-off-by: "Huang, Ying" Cc: Zi Yan Cc: Yang Shi Cc: Baolin Wang Cc: Oscar Salvador Cc: Matthew Wilcox --- mm/migrate.c | 155 +++++++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 139 insertions(+), 16 deletions(-) diff --git a/mm/migrate.c b/mm/migrate.c index 1077af858e36..165cbbc834e2 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -996,6 +996,32 @@ static void __migrate_page_extract(struct page *newpage, #define MIGRATEPAGE_UNMAP 1 +static void migrate_page_undo_page(struct page *page, + int page_was_mapped, + struct anon_vma *anon_vma, + struct list_head *ret) +{ + struct folio *folio = page_folio(page); + + if (page_was_mapped) + remove_migration_ptes(folio, folio, false); + if (anon_vma) + put_anon_vma(anon_vma); + unlock_page(page); + list_move_tail(&page->lru, ret); +} + +static void migrate_page_undo_newpage(struct page *newpage, + free_page_t put_new_page, + unsigned long private) +{ + unlock_page(newpage); + if (put_new_page) + put_new_page(newpage, private); + else + put_page(newpage); +} + static int __migrate_page_unmap(struct page *page, struct page *newpage, int force, enum migrate_mode mode) { @@ -1140,6 +1166,8 @@ static int __migrate_page_move(struct page *page, struct page *newpage, rc = move_to_new_folio(dst, folio, mode); + if (rc != -EAGAIN) + list_del(&newpage->lru); /* * When successful, push newpage to LRU immediately: so that if it * turns out to be an mlocked page, remove_migration_ptes() will @@ -1155,6 +1183,11 @@ static int __migrate_page_move(struct page *page, struct page *newpage, lru_add_drain(); } + if (rc == -EAGAIN) { + __migrate_page_record(newpage, page_was_mapped, anon_vma); + return rc; + } + if (page_was_mapped) remove_migration_ptes(folio, rc == MIGRATEPAGE_SUCCESS ? dst : folio, false); @@ -1220,6 +1253,7 @@ static int migrate_page_unmap(new_page_t get_new_page, free_page_t put_new_page, return -ENOMEM; *newpagep = newpage; + newpage->private = 0; rc = __migrate_page_unmap(page, newpage, force, mode); if (rc == MIGRATEPAGE_UNMAP) return rc; @@ -1258,7 +1292,7 @@ static int migrate_page_move(free_page_t put_new_page, unsigned long private, * removed and will be freed. A page that has not been * migrated will have kept its references and be restored. 
*/ - list_del(&page->lru); + list_del_init(&page->lru); } /* @@ -1268,9 +1302,8 @@ static int migrate_page_move(free_page_t put_new_page, unsigned long private, */ if (rc == MIGRATEPAGE_SUCCESS) { migrate_page_done(page, reason); - } else { - if (rc != -EAGAIN) - list_add_tail(&page->lru, ret); + } else if (rc != -EAGAIN) { + list_add_tail(&page->lru, ret); if (put_new_page) put_new_page(newpage, private); @@ -1455,11 +1488,13 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page, int pass = 0; bool is_thp = false; struct page *page; - struct page *newpage = NULL; + struct page *newpage = NULL, *newpage2; struct page *page2; int rc, nr_subpages; LIST_HEAD(ret_pages); LIST_HEAD(thp_split_pages); + LIST_HEAD(unmap_pages); + LIST_HEAD(new_pages); bool nosplit = (reason == MR_NUMA_MISPLACED); bool no_subpage_counting = false; @@ -1541,19 +1576,19 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page, nr_subpages = compound_nr(page); cond_resched(); - if (PageHuge(page)) + if (PageHuge(page)) { + list_move_tail(&page->lru, &ret_pages); continue; + } rc = migrate_page_unmap(get_new_page, put_new_page, private, page, &newpage, pass > 2, mode, reason, &ret_pages); - if (rc == MIGRATEPAGE_UNMAP) - rc = migrate_page_move(put_new_page, private, - page, newpage, mode, - reason, &ret_pages); /* * The rules are: * Success: page will be freed + * Unmap: page will be put on unmap_pages list, + * new page put on new_pages list * -EAGAIN: stay on the from list * -ENOMEM: stay on the from list * -ENOSYS: stay on the from list @@ -1589,7 +1624,7 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page, case -ENOMEM: /* * When memory is low, don't bother to try to migrate - * other pages, just exit. + * other pages, move unmapped pages, then exit. */ if (is_thp) { nr_thp_failed++; @@ -1610,9 +1645,11 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page, * the caller otherwise the page refcnt will be leaked. */ list_splice_init(&thp_split_pages, from); - /* nr_failed isn't updated for not used */ nr_thp_failed += thp_retry; - goto out; + if (list_empty(&unmap_pages)) + goto out; + else + goto move; case -EAGAIN: if (is_thp) thp_retry++; @@ -1625,6 +1662,10 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page, if (is_thp) nr_thp_succeeded++; break; + case MIGRATEPAGE_UNMAP: + list_move_tail(&page->lru, &unmap_pages); + list_add_tail(&newpage->lru, &new_pages); + break; default: /* * Permanent failure (-EBUSY, etc.): @@ -1645,12 +1686,96 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page, nr_failed += retry; nr_thp_failed += thp_retry; nr_failed_pages += nr_retry_pages; +move: + retry = 1; + thp_retry = 1; + for (pass = 0; pass < 10 && (retry || thp_retry); pass++) { + retry = 0; + thp_retry = 0; + nr_retry_pages = 0; + + newpage = list_first_entry(&new_pages, struct page, lru); + newpage2 = list_next_entry(newpage, lru); + list_for_each_entry_safe(page, page2, &unmap_pages, lru) { + /* + * THP statistics is based on the source huge page. + * Capture required information that might get lost + * during migration. 
+ */ + is_thp = PageTransHuge(page) && !PageHuge(page); + nr_subpages = compound_nr(page); + cond_resched(); + + rc = migrate_page_move(put_new_page, private, + page, newpage, mode, + reason, &ret_pages); + /* + * The rules are: + * Success: page will be freed + * -EAGAIN: stay on the unmap_pages list + * Other errno: put on ret_pages list then splice to + * from list + */ + switch(rc) { + case -EAGAIN: + if (is_thp) + thp_retry++; + else if (!no_subpage_counting) + retry++; + nr_retry_pages += nr_subpages; + break; + case MIGRATEPAGE_SUCCESS: + nr_succeeded += nr_subpages; + if (is_thp) + nr_thp_succeeded++; + break; + default: + /* + * Permanent failure (-EBUSY, etc.): + * unlike -EAGAIN case, the failed page is + * removed from migration page list and not + * retried in the next outer loop. + */ + if (is_thp) + nr_thp_failed++; + else if (!no_subpage_counting) + nr_failed++; + + nr_failed_pages += nr_subpages; + break; + } + newpage = newpage2; + newpage2 = list_next_entry(newpage, lru); + } + } + nr_failed += retry; + nr_thp_failed += thp_retry; + nr_failed_pages += nr_retry_pages; + + rc = nr_failed + nr_thp_failed; +out: + /* Cleanup remaining pages */ + newpage = list_first_entry(&new_pages, struct page, lru); + newpage2 = list_next_entry(newpage, lru); + list_for_each_entry_safe(page, page2, &unmap_pages, lru) { + int page_was_mapped = 0; + struct anon_vma *anon_vma = NULL; + + __migrate_page_extract(newpage, &page_was_mapped, &anon_vma); + migrate_page_undo_page(page, page_was_mapped, anon_vma, + &ret_pages); + list_del(&newpage->lru); + migrate_page_undo_newpage(newpage, put_new_page, private); + newpage = newpage2; + newpage2 = list_next_entry(newpage, lru); + } + /* * Try to migrate subpages of fail-to-migrate THPs, no nr_failed * counting in this round, since all subpages of a THP is counted * as 1 failure in the first round. */ - if (!list_empty(&thp_split_pages)) { + if (rc >= 0 && !list_empty(&thp_split_pages)) { /* * Move non-migrated pages (after 10 retries) to ret_pages * to avoid migrating them again. @@ -1662,8 +1787,6 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page, goto thp_subpage_migration; } - rc = nr_failed + nr_thp_failed; -out: /* * Put the permanent failure page back to migration list, they * will be put back to the right list by the caller. 
From patchwork Wed Sep 21 06:06:15 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Huang, Ying" X-Patchwork-Id: 12983242 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 18F7DECAAD8 for ; Wed, 21 Sep 2022 06:07:06 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id A8A82940009; Wed, 21 Sep 2022 02:07:05 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A3ACE940007; Wed, 21 Sep 2022 02:07:05 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 88F23940009; Wed, 21 Sep 2022 02:07:05 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 7B698940007 for ; Wed, 21 Sep 2022 02:07:05 -0400 (EDT) Received: from smtpin29.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 3F1B21C5C21 for ; Wed, 21 Sep 2022 06:07:05 +0000 (UTC) X-FDA: 79935059610.29.829F91D Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by imf28.hostedemail.com (Postfix) with ESMTP id 83BD1C000F for ; Wed, 21 Sep 2022 06:07:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1663740424; x=1695276424; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=/7kWcvtAL/CcEdmW0newy/EIa1m6JlYj2jWMzPR9NMs=; b=Ng7R5O6gJz+tPkJHa7/AWRPzlHjizlhaNg7bb1AeJQE0CLLThK7iAV3x JA7Kc7LJmEhUJN2B0NoLeSTHG0J9XNFGLYV+sEHPNYM7n/fjpNqJTMtZN vynkDlQ3Nqn+A1T+OZLKxBBigosqZQ8Sp/aGsZndqWQNvKAcD2r7Csxi3 hDj/QDhsUQvin8qAMlclWEa6EvFIT88TR3QQ37PkWT3Yye/Xs8YaCyv8d H5kZAw76ipn+TNquZLaV0Lpy6cgzQzdRggU1UynyoAl+09kOl1z8DurvA l1jfGSpZ2+0haJbH413hUji1I8xARQvyDf8pGuf4Koxtk1cCMMIcYOClH g==; X-IronPort-AV: E=McAfee;i="6500,9779,10476"; a="282956858" X-IronPort-AV: E=Sophos;i="5.93,332,1654585200"; d="scan'208";a="282956858" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Sep 2022 23:07:04 -0700 X-IronPort-AV: E=Sophos;i="5.93,332,1654585200"; d="scan'208";a="649913954" Received: from yhuang6-mobl2.sh.intel.com ([10.238.5.245]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Sep 2022 23:07:01 -0700 From: Huang Ying To: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org, Andrew Morton , Huang Ying , Zi Yan , Yang Shi , Baolin Wang , Oscar Salvador , Matthew Wilcox Subject: [RFC 5/6] mm/migrate_pages: share more code between _unmap and _move Date: Wed, 21 Sep 2022 14:06:15 +0800 Message-Id: <20220921060616.73086-6-ying.huang@intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220921060616.73086-1-ying.huang@intel.com> References: <20220921060616.73086-1-ying.huang@intel.com> MIME-Version: 1.0 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1663740424; a=rsa-sha256; cv=none; b=8U1S1Tn9osPnCyTj+5/fx5h7LPt6/JShO6V22LXwObgtI+KABNrkFdJ76/jxb7fQn1qRed jxow0qmF42152ZHEZXgisRAd/FnVboG1kHtH06sf7N5afRlmVRkJSGfkl5wC5IIEtKQ3m3 7qQlRybfMGSgJ5zeaJqEO8vzbeQAdm4= ARC-Authentication-Results: i=1; imf28.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=Ng7R5O6g; spf=pass (imf28.hostedemail.com: domain 
of ying.huang@intel.com designates 134.134.136.126 as permitted sender) smtp.mailfrom=ying.huang@intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1663740424; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=eQyFkYHJbG5EQSPlY7Hnf3+gCFwV5GHBg7K+f5vEopI=; b=QO74o8ialT5GA7KiA/TwPZnWOgutja77QOQJpyqJKqg04/Zq4iomYRquKQctRtxD1DdXCv BfhTbC3gsjAUjgBK+rJoB/sdt2wL4RO9fBh93PuIGjZ4H4iM9Gz3cYV3gKjBn6YS+KlEnt AkGQaqZGHhvmrc6UR8OvUvb8jcJTMpA= X-Rspamd-Server: rspam06 X-Rspam-User: X-Stat-Signature: x7dmofch3jss6sciui1qd4uxcrdyxnn3 X-Rspamd-Queue-Id: 83BD1C000F Authentication-Results: imf28.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=Ng7R5O6g; spf=pass (imf28.hostedemail.com: domain of ying.huang@intel.com designates 134.134.136.126 as permitted sender) smtp.mailfrom=ying.huang@intel.com; dmarc=pass (policy=none) header.from=intel.com X-HE-Tag: 1663740424-106311 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This is a code cleanup patch to reduce the duplicated code between the _unmap and _move stages of migrate_pages(). No functionality change is expected. Signed-off-by: "Huang, Ying" Cc: Zi Yan Cc: Yang Shi Cc: Baolin Wang Cc: Oscar Salvador Cc: Matthew Wilcox --- mm/migrate.c | 240 +++++++++++++++++++++------------------------------ 1 file changed, 100 insertions(+), 140 deletions(-) diff --git a/mm/migrate.c b/mm/migrate.c index 165cbbc834e2..042fa147f302 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -999,6 +999,7 @@ static void __migrate_page_extract(struct page *newpage, static void migrate_page_undo_page(struct page *page, int page_was_mapped, struct anon_vma *anon_vma, + bool locked, struct list_head *ret) { struct folio *folio = page_folio(page); @@ -1007,30 +1008,77 @@ static void migrate_page_undo_page(struct page *page, remove_migration_ptes(folio, folio, false); if (anon_vma) put_anon_vma(anon_vma); - unlock_page(page); - list_move_tail(&page->lru, ret); + if (locked) + unlock_page(page); + if (ret) + list_move_tail(&page->lru, ret); } static void migrate_page_undo_newpage(struct page *newpage, + bool locked, free_page_t put_new_page, unsigned long private) { - unlock_page(newpage); + if (locked) + unlock_page(newpage); if (put_new_page) put_new_page(newpage, private); else put_page(newpage); } -static int __migrate_page_unmap(struct page *page, struct page *newpage, - int force, enum migrate_mode mode) +static void migrate_page_done(struct page *page, + enum migrate_reason reason) +{ + /* + * Compaction can migrate also non-LRU pages which are + * not accounted to NR_ISOLATED_*. They can be recognized + * as __PageMovable + */ + if (likely(!__PageMovable(page))) + mod_node_page_state(page_pgdat(page), NR_ISOLATED_ANON + + page_is_file_lru(page), -thp_nr_pages(page)); + + if (reason != MR_MEMORY_FAILURE) + /* We release the page in page_handle_poison. */ + put_page(page); +} + +/* Obtain the lock on page, remove all ptes. 
*/ +static int migrate_page_unmap(new_page_t get_new_page, free_page_t put_new_page, + unsigned long private, struct page *page, + struct page **newpagep, int force, + enum migrate_mode mode, enum migrate_reason reason, + struct list_head *ret) { struct folio *folio = page_folio(page); - int rc = -EAGAIN; + int rc = MIGRATEPAGE_UNMAP; + struct page *newpage = NULL; int page_was_mapped = 0; struct anon_vma *anon_vma = NULL; bool is_lru = !__PageMovable(page); + bool locked = false; + bool newpage_locked = false; + + if (!thp_migration_supported() && PageTransHuge(page)) + return -ENOSYS; + if (page_count(page) == 1) { + /* Page was freed from under us. So we are done. */ + ClearPageActive(page); + ClearPageUnevictable(page); + /* free_pages_prepare() will clear PG_isolated. */ + list_del(&page->lru); + migrate_page_done(page, reason); + return MIGRATEPAGE_SUCCESS; + } + + newpage = get_new_page(page, private); + if (!newpage) + return -ENOMEM; + *newpagep = newpage; + + rc = -EAGAIN; if (!trylock_page(page)) { if (!force || mode == MIGRATE_ASYNC) goto out; @@ -1053,6 +1101,7 @@ static int __migrate_page_unmap(struct page *page, struct page *newpage, lock_page(page); } + locked = true; if (PageWriteback(page)) { /* @@ -1067,10 +1116,10 @@ static int __migrate_page_unmap(struct page *page, struct page *newpage, break; default: rc = -EBUSY; - goto out_unlock; + goto out; } if (!force) - goto out_unlock; + goto out; wait_on_page_writeback(page); } @@ -1100,7 +1149,8 @@ static int __migrate_page_unmap(struct page *page, struct page *newpage, * This is much like races on refcount of oldpage: just don't BUG(). */ if (unlikely(!trylock_page(newpage))) - goto out_unlock; + goto out; + newpage_locked = true; if (unlikely(!is_lru)) { __migrate_page_record(newpage, page_was_mapped, anon_vma); @@ -1123,7 +1173,7 @@ static int __migrate_page_unmap(struct page *page, struct page *newpage, VM_BUG_ON_PAGE(PageAnon(page), page); if (page_has_private(page)) { try_to_free_buffers(folio); - goto out_unlock_both; + goto out; } } else if (page_mapped(page)) { /* Establish migration ptes */ @@ -1141,20 +1191,28 @@ static int __migrate_page_unmap(struct page *page, struct page *newpage, if (page_was_mapped) remove_migration_ptes(folio, folio, false); -out_unlock_both: - unlock_page(newpage); -out_unlock: - /* Drop an anon_vma reference if we took one */ - if (anon_vma) - put_anon_vma(anon_vma); - unlock_page(page); out: + /* + * A page that has not been migrated will have kept its + * references and be restored. + */ + /* restore the page to right list. */ + if (rc != -EAGAIN) + ret = NULL; + + migrate_page_undo_page(page, page_was_mapped, anon_vma, locked, ret); + if (newpage) + migrate_page_undo_newpage(newpage, newpage_locked, + put_new_page, private); return rc; } -static int __migrate_page_move(struct page *page, struct page *newpage, - enum migrate_mode mode) +/* Migrate the page to the newly allocated page in newpage. 
*/ +static int migrate_page_move(free_page_t put_new_page, unsigned long private, + struct page *page, struct page *newpage, + enum migrate_mode mode, enum migrate_reason reason, + struct list_head *ret) { struct folio *folio = page_folio(page); struct folio *dst = page_folio(newpage); @@ -1165,9 +1223,10 @@ static int __migrate_page_move(struct page *page, struct page *newpage, __migrate_page_extract(newpage, &page_was_mapped, &anon_vma); rc = move_to_new_folio(dst, folio, mode); + if (rc) + goto out; - if (rc != -EAGAIN) - list_del(&newpage->lru); + list_del(&newpage->lru); /* * When successful, push newpage to LRU immediately: so that if it * turns out to be an mlocked page, remove_migration_ptes() will @@ -1177,139 +1236,40 @@ static int __migrate_page_move(struct page *page, struct page *newpage, * unsuccessful, and other cases when a page has been temporarily * isolated from the unevictable LRU: but this case is the easiest. */ - if (rc == MIGRATEPAGE_SUCCESS) { - lru_cache_add(newpage); - if (page_was_mapped) - lru_add_drain(); - } - - if (rc == -EAGAIN) { - __migrate_page_record(newpage, page_was_mapped, anon_vma); - return rc; - } - + lru_cache_add(newpage); if (page_was_mapped) - remove_migration_ptes(folio, - rc == MIGRATEPAGE_SUCCESS ? dst : folio, false); + lru_add_drain(); + if (page_was_mapped) + remove_migration_ptes(folio, dst, false); unlock_page(newpage); - /* Drop an anon_vma reference if we took one */ - if (anon_vma) - put_anon_vma(anon_vma); - unlock_page(page); + set_page_owner_migrate_reason(newpage, reason); /* * If migration is successful, decrease refcount of the newpage, * which will not free the page because new page owner increased * refcounter. */ - if (rc == MIGRATEPAGE_SUCCESS) - put_page(newpage); - - return rc; -} + put_page(newpage); -static void migrate_page_done(struct page *page, - enum migrate_reason reason) -{ /* - * Compaction can migrate also non-LRU pages which are - * not accounted to NR_ISOLATED_*. They can be recognized - * as __PageMovable + * A page that has been migrated has all references removed + * and will be freed. */ - if (likely(!__PageMovable(page))) - mod_node_page_state(page_pgdat(page), NR_ISOLATED_ANON + - page_is_file_lru(page), -thp_nr_pages(page)); - - if (reason != MR_MEMORY_FAILURE) - /* We release the page in page_handle_poison. */ - put_page(page); -} - -/* Obtain the lock on page, remove all ptes. */ -static int migrate_page_unmap(new_page_t get_new_page, free_page_t put_new_page, - unsigned long private, struct page *page, - struct page **newpagep, int force, - enum migrate_mode mode, enum migrate_reason reason, - struct list_head *ret) -{ - int rc = MIGRATEPAGE_UNMAP; - struct page *newpage = NULL; - - if (!thp_migration_supported() && PageTransHuge(page)) - return -ENOSYS; - - if (page_count(page) == 1) { - /* Page was freed from under us. So we are done. */ - ClearPageActive(page); - ClearPageUnevictable(page); - /* free_pages_prepare() will clear PG_isolated. */ - list_del(&page->lru); - migrate_page_done(page, reason); - return MIGRATEPAGE_SUCCESS; - } - - newpage = get_new_page(page, private); - if (!newpage) - return -ENOMEM; - *newpagep = newpage; - - newpage->private = 0; - rc = __migrate_page_unmap(page, newpage, force, mode); - if (rc == MIGRATEPAGE_UNMAP) - return rc; - - /* - * A page that has not been migrated will have kept its - * references and be restored. - */ - /* restore the page to right list. 
-	if (rc != -EAGAIN)
-		list_move_tail(&page->lru, ret);
-
-	if (put_new_page)
-		put_new_page(newpage, private);
-	else
-		put_page(newpage);
+	list_del(&page->lru);
+	migrate_page_undo_page(page, 0, anon_vma, true, NULL);
+	migrate_page_done(page, reason);
 
 	return rc;
-}
-/* Migrate the page to the newly allocated page in newpage. */
-static int migrate_page_move(free_page_t put_new_page, unsigned long private,
-				struct page *page, struct page *newpage,
-				enum migrate_mode mode, enum migrate_reason reason,
-				struct list_head *ret)
-{
-	int rc;
-
-	rc = __migrate_page_move(page, newpage, mode);
-	if (rc == MIGRATEPAGE_SUCCESS)
-		set_page_owner_migrate_reason(newpage, reason);
-
-	if (rc != -EAGAIN) {
-		/*
-		 * A page that has been migrated has all references
-		 * removed and will be freed. A page that has not been
-		 * migrated will have kept its references and be restored.
-		 */
-		list_del_init(&page->lru);
+out:
+	if (rc == -EAGAIN) {
+		__migrate_page_record(newpage, page_was_mapped, anon_vma);
+		return rc;
 	}
 
-	/*
-	 * If migration is successful, releases reference grabbed during
-	 * isolation. Otherwise, restore the page to right list unless
-	 * we want to retry.
-	 */
-	if (rc == MIGRATEPAGE_SUCCESS) {
-		migrate_page_done(page, reason);
-	} else if (rc != -EAGAIN) {
-		list_add_tail(&page->lru, ret);
-
-		if (put_new_page)
-			put_new_page(newpage, private);
-		else
-			put_page(newpage);
-	}
+	migrate_page_undo_page(page, page_was_mapped, anon_vma, true, ret);
+	list_del(&newpage->lru);
+	migrate_page_undo_newpage(newpage, true, put_new_page, private);
 
 	return rc;
 }
@@ -1763,9 +1723,9 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 			__migrate_page_extract(newpage, &page_was_mapped, &anon_vma);
 			migrate_page_undo_page(page, page_was_mapped, anon_vma,
-						&ret_pages);
+						true, &ret_pages);
 			list_del(&newpage->lru);
-			migrate_page_undo_newpage(newpage, put_new_page, private);
+			migrate_page_undo_newpage(newpage, true, put_new_page, private);
 			newpage = newpage2;
 			newpage2 = list_next_entry(newpage, lru);
 		}

From patchwork Wed Sep 21 06:06:16 2022
X-Patchwork-Submitter: "Huang, Ying" 
X-Patchwork-Id: 12983243
From: Huang Ying 
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Andrew Morton , Huang Ying , Zi Yan ,
 Yang Shi , Baolin Wang , Oscar Salvador , Matthew Wilcox 
Subject: [RFC 6/6] mm/migrate_pages: batch flushing TLB
Date: Wed, 21 Sep 2022 14:06:16 +0800
Message-Id: <20220921060616.73086-7-ying.huang@intel.com>
In-Reply-To: <20220921060616.73086-1-ying.huang@intel.com>
References: <20220921060616.73086-1-ying.huang@intel.com>

The TLB flushing can cost quite some CPU cycles during page migration
in some situations.  For example, when the pages of a process with
multiple active threads running on multiple CPUs are migrated, every
PTE change must be flushed on all of those CPUs.

After batching the _unmap and _move steps in migrate_pages(), the TLB
flushing can be batched easily with the existing TLB flush batching
mechanism.  This patch implements that.

We use the following test case to test the patch.  On a 2-socket Intel
server,

- Run the pmbench memory accessing benchmark.

- Run `migratepages` to migrate the pages of pmbench between node 0 and
  node 1 back and forth.

With the patch, the number of TLB flushing IPIs is reduced by 99.1%
during the test, and the number of pages migrated successfully per
second increases by 291.7%.
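The `migratepages` step of the test can also be driven by a small
userspace program built on the migrate_pages(2) syscall (the <numaif.h>
wrapper shipped with libnuma).  The sketch below is only an
illustration of the test setup, not part of the patch: it assumes a
2-node machine, takes the pmbench pid on the command line, and uses an
arbitrary pacing interval.

/*
 * pingpong.c - move all pages of a target process (e.g. pmbench) back
 * and forth between node 0 and node 1, like migratepages(8) does.
 *
 * Illustrative sketch only.  Assumes a 2-node system.
 * Build with: gcc -o pingpong pingpong.c -lnuma
 */
#include <numaif.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	unsigned long node0 = 1UL << 0;	/* nodemask containing only node 0 */
	unsigned long node1 = 1UL << 1;	/* nodemask containing only node 1 */
	unsigned long maxnode = 8 * sizeof(unsigned long);
	int pid;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <pid>\n", argv[0]);
		return 1;
	}
	pid = atoi(argv[1]);

	/* Run until interrupted, migrating the target's pages back and forth. */
	for (;;) {
		if (migrate_pages(pid, maxnode, &node0, &node1) < 0)
			perror("migrate_pages 0->1");
		if (migrate_pages(pid, maxnode, &node1, &node0) < 0)
			perror("migrate_pages 1->0");
		usleep(100 * 1000);	/* arbitrary pacing */
	}
	return 0;
}

In the test described above the driver was the migratepages(8) tool
from numactl; the program here just shows the same back-and-forth
pattern in a self-contained form.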
Signed-off-by: "Huang, Ying" 
Cc: Zi Yan 
Cc: Yang Shi 
Cc: Baolin Wang 
Cc: Oscar Salvador 
Cc: Matthew Wilcox 
---
 mm/migrate.c |  4 +++-
 mm/rmap.c    | 24 ++++++++++++++++++++----
 2 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 042fa147f302..a0de0d9b4d41 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1179,7 +1179,7 @@ static int migrate_page_unmap(new_page_t get_new_page, free_page_t put_new_page,
 		/* Establish migration ptes */
 		VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) &&
				!anon_vma, page);
-		try_to_migrate(folio, 0);
+		try_to_migrate(folio, TTU_BATCH_FLUSH);
 		page_was_mapped = 1;
 	}
 
@@ -1647,6 +1647,8 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 	nr_thp_failed += thp_retry;
 	nr_failed_pages += nr_retry_pages;
 move:
+	try_to_unmap_flush();
+
 	retry = 1;
 	thp_retry = 1;
 	for (pass = 0; pass < 10 && (retry || thp_retry); pass++) {
diff --git a/mm/rmap.c b/mm/rmap.c
index 93d5a6f793d2..ab88136720dc 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1960,8 +1960,24 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 			pteval = huge_ptep_clear_flush(vma, address, pvmw.pte);
 		} else {
 			flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
-			/* Nuke the page table entry. */
-			pteval = ptep_clear_flush(vma, address, pvmw.pte);
+			/*
+			 * Nuke the page table entry.
+			 */
+			if (should_defer_flush(mm, flags)) {
+				/*
+				 * We clear the PTE but do not flush so potentially
+				 * a remote CPU could still be writing to the folio.
+				 * If the entry was previously clean then the
+				 * architecture must guarantee that a clear->dirty
+				 * transition on a cached TLB entry is written through
+				 * and traps if the PTE is unmapped.
+				 */
+				pteval = ptep_get_and_clear(mm, address, pvmw.pte);
+
+				set_tlb_ubc_flush_pending(mm, pte_dirty(pteval));
+			} else {
+				pteval = ptep_clear_flush(vma, address, pvmw.pte);
+			}
 		}
 
 		/* Set the dirty flag on the folio now the pte is gone. */
@@ -2128,10 +2144,10 @@ void try_to_migrate(struct folio *folio, enum ttu_flags flags)
 
 	/*
 	 * Migration always ignores mlock and only supports TTU_RMAP_LOCKED and
-	 * TTU_SPLIT_HUGE_PMD and TTU_SYNC flags.
+	 * TTU_SPLIT_HUGE_PMD, TTU_SYNC and TTU_BATCH_FLUSH flags.
 	 */
 	if (WARN_ON_ONCE(flags & ~(TTU_RMAP_LOCKED | TTU_SPLIT_HUGE_PMD |
-					TTU_SYNC)))
+					TTU_SYNC | TTU_BATCH_FLUSH)))
 		return;
 
 	if (folio_is_zone_device(folio) &&