From patchwork Wed Sep 21 06:06:14 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Huang, Ying" <ying.huang@intel.com>
X-Patchwork-Id: 12983241
From: Huang Ying <ying.huang@intel.com>
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Andrew Morton, Huang Ying, Zi Yan,
	Yang Shi, Baolin Wang, Oscar Salvador, Matthew Wilcox
Subject: [RFC 4/6] mm/migrate_pages: batch _unmap and _move
Date: Wed, 21 Sep 2022 14:06:14 +0800
Message-Id: <20220921060616.73086-5-ying.huang@intel.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220921060616.73086-1-ying.huang@intel.com>
References: <20220921060616.73086-1-ying.huang@intel.com>

In this patch the _unmap and _move stages of page migration are
batched.  That is, previously the loop was,

  for each page
    _unmap()
    _move()

Now, it is,

  for each page
    _unmap()
  for each page
    _move()

Based on this, we can batch the TLB flushing and use some hardware
accelerator to copy pages between the batched _unmap and batched _move
stages.

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Zi Yan
Cc: Yang Shi
Cc: Baolin Wang
Cc: Oscar Salvador
Cc: Matthew Wilcox
---
 mm/migrate.c | 155 +++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 139 insertions(+), 16 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 1077af858e36..165cbbc834e2 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -996,6 +996,32 @@ static void __migrate_page_extract(struct page *newpage,
 
 #define MIGRATEPAGE_UNMAP	1
 
+static void migrate_page_undo_page(struct page *page,
+				   int page_was_mapped,
+				   struct anon_vma *anon_vma,
+				   struct list_head *ret)
+{
+	struct folio *folio = page_folio(page);
+
+	if (page_was_mapped)
+		remove_migration_ptes(folio, folio, false);
+	if (anon_vma)
+		put_anon_vma(anon_vma);
+	unlock_page(page);
+	list_move_tail(&page->lru, ret);
+}
+
+static void migrate_page_undo_newpage(struct page *newpage,
+				      free_page_t put_new_page,
+				      unsigned long private)
+{
+	unlock_page(newpage);
+	if (put_new_page)
+		put_new_page(newpage, private);
+	else
+		put_page(newpage);
+}
+
 static int __migrate_page_unmap(struct page *page, struct page *newpage,
 				int force, enum migrate_mode mode)
 {
@@ -1140,6 +1166,8 @@ static int __migrate_page_move(struct page *page, struct page *newpage,
 
 	rc = move_to_new_folio(dst, folio, mode);
 
+	if (rc != -EAGAIN)
+		list_del(&newpage->lru);
 	/*
 	 * When successful, push newpage to LRU immediately: so that if it
 	 * turns out to be an mlocked page, remove_migration_ptes() will
@@ -1155,6 +1183,11 @@ static int __migrate_page_move(struct page *page, struct page *newpage,
 			lru_add_drain();
 	}
 
+	if (rc == -EAGAIN) {
+		__migrate_page_record(newpage, page_was_mapped, anon_vma);
+		return rc;
+	}
+
 	if (page_was_mapped)
 		remove_migration_ptes(folio,
			rc == MIGRATEPAGE_SUCCESS ? dst : folio, false);
@@ -1220,6 +1253,7 @@ static int migrate_page_unmap(new_page_t get_new_page, free_page_t put_new_page,
 		return -ENOMEM;
 	*newpagep = newpage;
 
+	newpage->private = 0;
 	rc = __migrate_page_unmap(page, newpage, force, mode);
 	if (rc == MIGRATEPAGE_UNMAP)
 		return rc;
@@ -1258,7 +1292,7 @@ static int migrate_page_move(free_page_t put_new_page, unsigned long private,
 		 * removed and will be freed. A page that has not been
 		 * migrated will have kept its references and be restored.
 		 */
-		list_del(&page->lru);
+		list_del_init(&page->lru);
 	}
 
 	/*
@@ -1268,9 +1302,8 @@ static int migrate_page_move(free_page_t put_new_page, unsigned long private,
 	 */
 	if (rc == MIGRATEPAGE_SUCCESS) {
 		migrate_page_done(page, reason);
-	} else {
-		if (rc != -EAGAIN)
-			list_add_tail(&page->lru, ret);
+	} else if (rc != -EAGAIN) {
+		list_add_tail(&page->lru, ret);
 
 		if (put_new_page)
 			put_new_page(newpage, private);
@@ -1455,11 +1488,13 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 	int pass = 0;
 	bool is_thp = false;
 	struct page *page;
-	struct page *newpage = NULL;
+	struct page *newpage = NULL, *newpage2;
 	struct page *page2;
 	int rc, nr_subpages;
 	LIST_HEAD(ret_pages);
 	LIST_HEAD(thp_split_pages);
+	LIST_HEAD(unmap_pages);
+	LIST_HEAD(new_pages);
 	bool nosplit = (reason == MR_NUMA_MISPLACED);
 	bool no_subpage_counting = false;
 
@@ -1541,19 +1576,19 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 			nr_subpages = compound_nr(page);
 			cond_resched();
 
-			if (PageHuge(page))
+			if (PageHuge(page)) {
+				list_move_tail(&page->lru, &ret_pages);
 				continue;
+			}
 
 			rc = migrate_page_unmap(get_new_page, put_new_page, private,
 						page, &newpage, pass > 2, mode,
 						reason, &ret_pages);
-			if (rc == MIGRATEPAGE_UNMAP)
-				rc = migrate_page_move(put_new_page, private,
-						       page, newpage, mode,
-						       reason, &ret_pages);
 			/*
 			 * The rules are:
 			 *	Success: page will be freed
+			 *	Unmap: page will be put on unmap_pages list,
+			 *	       new page put on new_pages list
 			 *	-EAGAIN: stay on the from list
 			 *	-ENOMEM: stay on the from list
 			 *	-ENOSYS: stay on the from list
@@ -1589,7 +1624,7 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 			case -ENOMEM:
 				/*
 				 * When memory is low, don't bother to try to migrate
-				 * other pages, just exit.
+				 * other pages, move unmapped pages, then exit.
 				 */
 				if (is_thp) {
 					nr_thp_failed++;
@@ -1610,9 +1645,11 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 				 * the caller otherwise the page refcnt will be leaked.
 				 */
 				list_splice_init(&thp_split_pages, from);
-				/* nr_failed isn't updated for not used */
 				nr_thp_failed += thp_retry;
-				goto out;
+				if (list_empty(&unmap_pages))
+					goto out;
+				else
+					goto move;
 			case -EAGAIN:
 				if (is_thp)
 					thp_retry++;
@@ -1625,6 +1662,10 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 				if (is_thp)
 					nr_thp_succeeded++;
 				break;
+			case MIGRATEPAGE_UNMAP:
+				list_move_tail(&page->lru, &unmap_pages);
+				list_add_tail(&newpage->lru, &new_pages);
+				break;
 			default:
 				/*
 				 * Permanent failure (-EBUSY, etc.):
@@ -1645,12 +1686,96 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 	nr_failed += retry;
 	nr_thp_failed += thp_retry;
 	nr_failed_pages += nr_retry_pages;
+move:
+	retry = 1;
+	thp_retry = 1;
+	for (pass = 0; pass < 10 && (retry || thp_retry); pass++) {
+		retry = 0;
+		thp_retry = 0;
+		nr_retry_pages = 0;
+
+		newpage = list_first_entry(&new_pages, struct page, lru);
+		newpage2 = list_next_entry(newpage, lru);
+		list_for_each_entry_safe(page, page2, &unmap_pages, lru) {
+			/*
+			 * THP statistics is based on the source huge page.
+			 * Capture required information that might get lost
+			 * during migration.
+			 */
+			is_thp = PageTransHuge(page) && !PageHuge(page);
+			nr_subpages = compound_nr(page);
+			cond_resched();
+
+			rc = migrate_page_move(put_new_page, private,
+					       page, newpage, mode,
+					       reason, &ret_pages);
+			/*
+			 * The rules are:
+			 *	Success: page will be freed
+			 *	-EAGAIN: stay on the unmap_pages list
+			 *	Other errno: put on ret_pages list then splice to
+			 *		     from list
+			 */
+			switch(rc) {
+			case -EAGAIN:
+				if (is_thp)
+					thp_retry++;
+				else if (!no_subpage_counting)
+					retry++;
+				nr_retry_pages += nr_subpages;
+				break;
+			case MIGRATEPAGE_SUCCESS:
+				nr_succeeded += nr_subpages;
+				if (is_thp)
+					nr_thp_succeeded++;
+				break;
+			default:
+				/*
+				 * Permanent failure (-EBUSY, etc.):
+				 * unlike -EAGAIN case, the failed page is
+				 * removed from migration page list and not
+				 * retried in the next outer loop.
+				 */
+				if (is_thp)
+					nr_thp_failed++;
+				else if (!no_subpage_counting)
+					nr_failed++;
+
+				nr_failed_pages += nr_subpages;
+				break;
+			}
+			newpage = newpage2;
+			newpage2 = list_next_entry(newpage, lru);
+		}
+	}
+	nr_failed += retry;
+	nr_thp_failed += thp_retry;
+	nr_failed_pages += nr_retry_pages;
+
+	rc = nr_failed + nr_thp_failed;
+out:
+	/* Cleanup remaining pages */
+	newpage = list_first_entry(&new_pages, struct page, lru);
+	newpage2 = list_next_entry(newpage, lru);
+	list_for_each_entry_safe(page, page2, &unmap_pages, lru) {
+		int page_was_mapped = 0;
+		struct anon_vma *anon_vma = NULL;
+
+		__migrate_page_extract(newpage, &page_was_mapped, &anon_vma);
+		migrate_page_undo_page(page, page_was_mapped, anon_vma,
+				       &ret_pages);
+		list_del(&newpage->lru);
+		migrate_page_undo_newpage(newpage, put_new_page, private);
+		newpage = newpage2;
+		newpage2 = list_next_entry(newpage, lru);
+	}
+
 	/*
 	 * Try to migrate subpages of fail-to-migrate THPs, no nr_failed
 	 * counting in this round, since all subpages of a THP is counted
 	 * as 1 failure in the first round.
 	 */
-	if (!list_empty(&thp_split_pages)) {
+	if (rc >= 0 && !list_empty(&thp_split_pages)) {
 		/*
 		 * Move non-migrated pages (after 10 retries) to ret_pages
 		 * to avoid migrating them again.
@@ -1662,8 +1787,6 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 		goto thp_subpage_migration;
 	}
 
-	rc = nr_failed + nr_thp_failed;
-out:
 	/*
 	 * Put the permanent failure page back to migration list, they
 	 * will be put back to the right list by the caller.
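
For reference, below is a minimal, self-contained user-space sketch of the
two-phase flow described in the commit message.  It is an illustration only
and not part of the diff: try_unmap() and try_move() are simplified
stand-ins for migrate_page_unmap() and migrate_page_move(), and struct
fake_page is a toy model rather than struct page.

/*
 * Illustrative sketch: unmap the whole batch first, then move the whole
 * batch.  Grouping the phases is what lets the real code flush the TLB
 * once per batch and hand the copies to a hardware accelerator.
 */
#include <stdbool.h>
#include <stdio.h>

enum { NR_PAGES = 4 };

struct fake_page {
	int id;
	bool unmapped;	/* set by the _unmap phase */
	bool moved;	/* set by the _move phase */
};

/* Phase 1: unmap a single page (always succeeds in this model). */
static bool try_unmap(struct fake_page *p)
{
	p->unmapped = true;
	return true;
}

/* Phase 2: copy/move a page that was unmapped earlier. */
static bool try_move(struct fake_page *p)
{
	if (!p->unmapped)
		return false;
	p->moved = true;
	return true;
}

int main(void)
{
	struct fake_page pages[NR_PAGES] = {
		{ .id = 0 }, { .id = 1 }, { .id = 2 }, { .id = 3 },
	};
	int i;

	/* for each page: _unmap() -- batched TLB flush would go here */
	for (i = 0; i < NR_PAGES; i++)
		try_unmap(&pages[i]);

	/* for each page: _move() -- batched page copy would go here */
	for (i = 0; i < NR_PAGES; i++)
		try_move(&pages[i]);

	for (i = 0; i < NR_PAGES; i++)
		printf("page %d: unmapped=%d moved=%d\n",
		       pages[i].id, pages[i].unmapped, pages[i].moved);
	return 0;
}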