From patchwork Mon Aug 12 07:42:10 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Baolin Wang <baolin.wang@linux.alibaba.com>
X-Patchwork-Id: 13760276
From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: akpm@linux-foundation.org, hughd@google.com
Cc: willy@infradead.org, david@redhat.com, wangkefeng.wang@huawei.com,
    chrisl@kernel.org, ying.huang@intel.com, 21cnbao@gmail.com,
    ryan.roberts@arm.com, shy828301@gmail.com, ziy@nvidia.com,
    ioworker0@gmail.com, da.gomez@samsung.com, p.raghav@samsung.com,
    baolin.wang@linux.alibaba.com, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Subject: [PATCH v5 9/9] mm: shmem: support large folio swap out
Date: Mon, 12 Aug 2024 15:42:10 +0800
X-Mailer: git-send-email 2.39.3

Shmem will support large folio allocation [1] [2] to get better
performance; however, memory reclaim still splits the precious large
folios when trying to swap out shmem, which can lead to memory
fragmentation and fails to take advantage of large folios for shmem.
Since the swap code already supports swapping out large folios without
splitting them, this patch supports large folio swap-out for shmem.

Note that the i915_gem_shmem driver still needs its folios to be split
when swapping, so add a new 'split_large_folio' flag to
writeback_control to request splitting of the large folio.

[1] https://lore.kernel.org/all/cover.1717495894.git.baolin.wang@linux.alibaba.com/
[2] https://lore.kernel.org/all/20240515055719.32577-1-da.gomez@samsung.com/
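As an illustration of how the new flag is meant to be used, here is a
minimal caller sketch (the function name and surrounding loop are
hypothetical, not part of this patch) in the style of the
i915_gem_shmem change below:

/*
 * Hypothetical caller sketch: a writeback path that still requires
 * order-0 pages opts in to splitting by setting the new
 * split_large_folio flag; ordinary reclaim leaves the flag clear so
 * large folios can be swapped out whole.
 */
static void example_shmem_writeback(struct address_space *mapping)
{
	struct writeback_control wbc = {
		.sync_mode = WB_SYNC_NONE,
		.nr_to_write = SWAP_CLUSTER_MAX,
		.range_start = 0,
		.range_end = LLONG_MAX,
		.for_reclaim = 1,
		.split_large_folio = 1,	/* request order-0 splitting */
	};

	/* ... then write back each dirty folio via ->writepage(),
	 * as __shmem_writeback() does ... */
}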
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c |  1 +
 include/linux/writeback.h                 |  4 +++
 mm/shmem.c                                | 12 ++++++---
 mm/vmscan.c                               | 32 ++++++++++++++++++-----
 4 files changed, 38 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index c5e1c718a6d2..c66cb9c585e1 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -308,6 +308,7 @@ void __shmem_writeback(size_t size, struct address_space *mapping)
 		.range_start = 0,
 		.range_end = LLONG_MAX,
 		.for_reclaim = 1,
+		.split_large_folio = 1,
 	};
 	unsigned long i;
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index 1a54676d843a..10100e22d5c6 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -63,6 +63,7 @@ struct writeback_control {
 	unsigned range_cyclic:1;	/* range_start is cyclic */
 	unsigned for_sync:1;		/* sync(2) WB_SYNC_ALL writeback */
 	unsigned unpinned_netfs_wb:1;	/* Cleared I_PINNING_NETFS_WB */
+	unsigned split_large_folio:1;	/* Split large folio for shmem writeback */

 	/*
 	 * When writeback IOs are bounced through async layers, only the
@@ -79,6 +80,9 @@ struct writeback_control {
 	 */
 	struct swap_iocb **swap_plug;

+	/* Target list for splitting a large folio */
+	struct list_head *list;
+
 	/* internal fields used by the ->writepages implementation: */
 	struct folio_batch fbatch;
 	pgoff_t index;
diff --git a/mm/shmem.c b/mm/shmem.c
index 996062dc196b..50aeb03c4d34 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -795,7 +795,6 @@ static int shmem_add_to_page_cache(struct folio *folio,
 	VM_BUG_ON_FOLIO(index != round_down(index, nr), folio);
 	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
 	VM_BUG_ON_FOLIO(!folio_test_swapbacked(folio), folio);
-	VM_BUG_ON(expected && folio_test_large(folio));

 	folio_ref_add(folio, nr);
 	folio->mapping = mapping;
@@ -1482,10 +1481,11 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
 	 * "force", drivers/gpu/drm/i915/gem/i915_gem_shmem.c gets huge pages,
 	 * and its shmem_writeback() needs them to be split when swapping.
 	 */
-	if (folio_test_large(folio)) {
+	if (wbc->split_large_folio && folio_test_large(folio)) {
+try_split:
 		/* Ensure the subpages are still dirty */
 		folio_test_set_dirty(folio);
-		if (split_huge_page(page) < 0)
+		if (split_huge_page_to_list_to_order(page, wbc->list, 0))
 			goto redirty;
 		folio = page_folio(page);
 		folio_clear_dirty(folio);
@@ -1527,8 +1527,12 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
 	}

 	swap = folio_alloc_swap(folio);
-	if (!swap.val)
+	if (!swap.val) {
+		if (nr_pages > 1)
+			goto try_split;
+
 		goto redirty;
+	}

 	/*
 	 * Add inode to shmem_unuse()'s list of swapped-out inodes,
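For clarity, the control flow that the mm/shmem.c hunks above give
shmem_writepage() can be summarized as follows (a simplified sketch,
not the literal kernel code):

	/* Simplified flow of shmem_writepage() after this patch: */
	if (wbc->split_large_folio && folio_test_large(folio))
		goto try_split;		/* caller (e.g. i915) needs order-0 pages */

	swap = folio_alloc_swap(folio);	/* tries nr_pages contiguous entries */
	if (!swap.val) {
		if (nr_pages > 1)
			goto try_split;	/* no contiguous entries: split, retry */
		goto redirty;
	}

So a large folio is only split when the caller explicitly asks for it,
or when contiguous swap entries cannot be allocated for the whole folio.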
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 96ce889ea3d0..ba7b67218caf 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -628,7 +628,7 @@ typedef enum {
  * Calls ->writepage().
  */
 static pageout_t pageout(struct folio *folio, struct address_space *mapping,
-			 struct swap_iocb **plug)
+			 struct swap_iocb **plug, struct list_head *folio_list)
 {
 	/*
 	 * If the folio is dirty, only perform writeback if that write
@@ -676,6 +676,16 @@ static pageout_t pageout(struct folio *folio, struct address_space *mapping,
 			.swap_plug = plug,
 		};

+		/*
+		 * The large shmem folio can be split when CONFIG_THP_SWAP is
+		 * not enabled or when contiguous swap entries cannot be
+		 * allocated.
+		 */
+		if (shmem_mapping(mapping) && folio_test_large(folio)) {
+			wbc.list = folio_list;
+			wbc.split_large_folio = !IS_ENABLED(CONFIG_THP_SWAP);
+		}
+
 		folio_set_reclaim(folio);
 		res = mapping->a_ops->writepage(&folio->page, &wbc);
 		if (res < 0)
@@ -1257,11 +1267,6 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 					goto activate_locked_split;
 				}
 			}
-		} else if (folio_test_swapbacked(folio) &&
-			   folio_test_large(folio)) {
-			/* Split shmem folio */
-			if (split_folio_to_list(folio, folio_list))
-				goto keep_locked;
 		}

 		/*
@@ -1362,12 +1367,25 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 			 * starts and then write it out here.
 			 */
 			try_to_unmap_flush_dirty();
-			switch (pageout(folio, mapping, &plug)) {
+			switch (pageout(folio, mapping, &plug, folio_list)) {
 			case PAGE_KEEP:
 				goto keep_locked;
 			case PAGE_ACTIVATE:
+				/*
+				 * If the shmem folio was split during writeback
+				 * to swap, the tail pages will make their own
+				 * pass through this function and be counted then.
+				 */
+				if (nr_pages > 1 && !folio_test_large(folio)) {
+					sc->nr_scanned -= (nr_pages - 1);
+					nr_pages = 1;
+				}
 				goto activate_locked;
 			case PAGE_SUCCESS:
+				if (nr_pages > 1 && !folio_test_large(folio)) {
+					sc->nr_scanned -= (nr_pages - 1);
+					nr_pages = 1;
+				}
 				stat->nr_pageout += nr_pages;

 				if (folio_test_writeback(folio))
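Not part of the patch, but for anyone wanting to exercise the new path:
a small userspace sketch that backs a memfd with huge pages and pushes
it out to swap. It assumes a kernel with shmem THP and CONFIG_THP_SWAP
enabled, an active swap device, and glibc >= 2.27 for memfd_create();
whether reclaim actually keeps the folios whole depends on that config.

#define _GNU_SOURCE
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	size_t len = 16UL << 20;	/* 16MB shmem region */
	int fd = memfd_create("thp-swap-test", MFD_CLOEXEC);

	if (fd < 0 || ftruncate(fd, len) < 0)
		return 1;

	char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		return 1;

	madvise(p, len, MADV_HUGEPAGE);	/* ask for large folios */
	memset(p, 0x5a, len);		/* populate the folios */
	madvise(p, len, MADV_PAGEOUT);	/* reclaim them out to swap */

	munmap(p, len);
	close(fd);
	return 0;
}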