From patchwork Sat Jun 29 11:10:09 2024
X-Patchwork-Submitter: Barry Song <21cnbao@gmail.com>
X-Patchwork-Id: 13716890
From: Barry Song <21cnbao@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: chrisl@kernel.org, david@redhat.com, hannes@cmpxchg.org,
 kasong@tencent.com, linux-kernel@vger.kernel.org, mhocko@suse.com,
 nphamcs@gmail.com, ryan.roberts@arm.com, shy828301@gmail.com,
 surenb@google.com, kaleshsingh@google.com, hughd@google.com,
 v-songbaohua@oppo.com, willy@infradead.org, xiang@kernel.org,
 ying.huang@intel.com, yosryahmed@google.com,
 baolin.wang@linux.alibaba.com, shakeel.butt@linux.dev,
 senozhatsky@chromium.org, minchan@kernel.org
Subject: [PATCH RFC v4 1/2] mm: swap: introduce swapcache_prepare_nr and swapcache_clear_nr for large folios swap-in
Date: Sat, 29 Jun 2024 23:10:09 +1200
Message-Id: <20240629111010.230484-2-21cnbao@gmail.com>
In-Reply-To: <20240629111010.230484-1-21cnbao@gmail.com>
References: <20240629111010.230484-1-21cnbao@gmail.com>

From: Barry Song

Commit 13ddaf26be32 ("mm/swap: fix race when skipping swapcache") supports
one entry only. To support large folio swap-in, we need to handle multiple
swap entries.

Signed-off-by: Barry Song
---
 include/linux/swap.h |   4 +-
 mm/swap.h            |   4 +-
 mm/swapfile.c        | 114 +++++++++++++++++++++++++------------------
 3 files changed, 70 insertions(+), 52 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index e473fe6cfb7a..c0f4f2073ca6 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -481,7 +481,7 @@ extern int get_swap_pages(int n, swp_entry_t swp_entries[], int order);
 extern int add_swap_count_continuation(swp_entry_t, gfp_t);
 extern void swap_shmem_alloc(swp_entry_t);
 extern int swap_duplicate(swp_entry_t);
-extern int swapcache_prepare(swp_entry_t);
+extern int swapcache_prepare_nr(swp_entry_t entry, int nr);
 extern void swap_free_nr(swp_entry_t entry, int nr_pages);
 extern void swapcache_free_entries(swp_entry_t *entries, int n);
 extern void free_swap_and_cache_nr(swp_entry_t entry, int nr);
@@ -555,7 +555,7 @@ static inline int swap_duplicate(swp_entry_t swp)
 	return 0;
 }
 
-static inline int swapcache_prepare(swp_entry_t swp)
+static inline int swapcache_prepare_nr(swp_entry_t swp, int nr)
 {
 	return 0;
 }
diff --git a/mm/swap.h b/mm/swap.h
index baa1fa946b34..b96b1157441f 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -59,7 +59,7 @@ void __delete_from_swap_cache(struct folio *folio,
 void delete_from_swap_cache(struct folio *folio);
 void clear_shadow_from_swap_cache(int type, unsigned long begin,
 				  unsigned long end);
-void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry);
+void swapcache_clear_nr(struct swap_info_struct *si, swp_entry_t entry, int nr);
 struct folio *swap_cache_get_folio(swp_entry_t entry,
 		struct vm_area_struct *vma, unsigned long addr);
 struct folio *filemap_get_incore_folio(struct address_space *mapping,
@@ -120,7 +120,7 @@ static inline int swap_writepage(struct page *p, struct writeback_control *wbc)
 	return 0;
 }
 
-static inline void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry)
+static inline void swapcache_clear_nr(struct swap_info_struct *si, swp_entry_t entry, int nr)
 {
 }
 
diff --git a/mm/swapfile.c b/mm/swapfile.c
index f7224bc1320c..8f60dd10fdef 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1352,7 +1352,8 @@ static void swap_entry_free(struct swap_info_struct *p, swp_entry_t entry)
 }
 
 static void cluster_swap_free_nr(struct swap_info_struct *sis,
-		unsigned long offset, int nr_pages)
+		unsigned long offset, int nr_pages,
+		unsigned char usage)
 {
 	struct swap_cluster_info *ci;
 	DECLARE_BITMAP(to_free, BITS_PER_LONG) = { 0 };
@@ -1362,7 +1363,7 @@ static void cluster_swap_free_nr(struct swap_info_struct *sis,
 	while (nr_pages) {
 		nr = min(BITS_PER_LONG, nr_pages);
 		for (i = 0; i < nr; i++) {
-			if (!__swap_entry_free_locked(sis, offset + i, 1))
+			if (!__swap_entry_free_locked(sis, offset + i, usage))
 				bitmap_set(to_free, i, 1);
 		}
 		if (!bitmap_empty(to_free, BITS_PER_LONG)) {
@@ -1396,7 +1397,7 @@ void swap_free_nr(swp_entry_t entry, int nr_pages)
 
 	while (nr_pages) {
 		nr = min_t(int, nr_pages, SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER);
-		cluster_swap_free_nr(sis, offset, nr);
+		cluster_swap_free_nr(sis, offset, nr, 1);
 		offset += nr;
 		nr_pages -= nr;
 	}
@@ -3382,7 +3383,7 @@ void si_swapinfo(struct sysinfo *val)
 }
 
 /*
- * Verify that a swap entry is valid and increment its swap map count.
+ * Verify that nr swap entries are valid and increment their swap map counts.
  *
  * Returns error code in following case.
  * - success -> 0
@@ -3392,66 +3393,88 @@ void si_swapinfo(struct sysinfo *val)
  * - swap-cache reference is requested but the entry is not used. -> ENOENT
  * - swap-mapped reference requested but needs continued swap count. -> ENOMEM
  */
-static int __swap_duplicate(swp_entry_t entry, unsigned char usage)
+static int __swap_duplicate_nr(swp_entry_t entry, unsigned char usage, int nr)
 {
 	struct swap_info_struct *p;
 	struct swap_cluster_info *ci;
 	unsigned long offset;
 	unsigned char count;
 	unsigned char has_cache;
-	int err;
+	int err, i;
 
 	p = swp_swap_info(entry);
 
 	offset = swp_offset(entry);
+	VM_WARN_ON(nr > SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER);
 	ci = lock_cluster_or_swap_info(p, offset);
 
-	count = p->swap_map[offset];
+	err = 0;
+	for (i = 0; i < nr; i++) {
+		count = p->swap_map[offset + i];
 
-	/*
-	 * swapin_readahead() doesn't check if a swap entry is valid, so the
-	 * swap entry could be SWAP_MAP_BAD. Check here with lock held.
-	 */
-	if (unlikely(swap_count(count) == SWAP_MAP_BAD)) {
-		err = -ENOENT;
-		goto unlock_out;
-	}
+		/*
+		 * swapin_readahead() doesn't check if a swap entry is valid, so the
+		 * swap entry could be SWAP_MAP_BAD. Check here with lock held.
+		 */
+		if (unlikely(swap_count(count) == SWAP_MAP_BAD)) {
+			err = -ENOENT;
+			goto unlock_out;
+		}
 
-	has_cache = count & SWAP_HAS_CACHE;
-	count &= ~SWAP_HAS_CACHE;
-	err = 0;
+		has_cache = count & SWAP_HAS_CACHE;
+		count &= ~SWAP_HAS_CACHE;
 
-	if (usage == SWAP_HAS_CACHE) {
+		if (usage == SWAP_HAS_CACHE) {
+			/* set SWAP_HAS_CACHE if there is no cache and entry is used */
+			if (!has_cache && count)
+				continue;
+			else if (has_cache)		/* someone else added cache */
+				err = -EEXIST;
+			else				/* no users remaining */
+				err = -ENOENT;
 
-		/* set SWAP_HAS_CACHE if there is no cache and entry is used */
-		if (!has_cache && count)
-			has_cache = SWAP_HAS_CACHE;
-		else if (has_cache)		/* someone else added cache */
-			err = -EEXIST;
-		else				/* no users remaining */
-			err = -ENOENT;
+		} else if (count || has_cache) {
 
-	} else if (count || has_cache) {
+			if ((count & ~COUNT_CONTINUED) < SWAP_MAP_MAX)
+				continue;
+			else if ((count & ~COUNT_CONTINUED) > SWAP_MAP_MAX)
+				err = -EINVAL;
+			else if (swap_count_continued(p, offset + i, count))
+				continue;
+			else
+				err = -ENOMEM;
+		} else
+			err = -ENOENT;			/* unused swap entry */
 
-		if ((count & ~COUNT_CONTINUED) < SWAP_MAP_MAX)
+		if (err)
+			goto unlock_out;
+	}
+
+	for (i = 0; i < nr; i++) {
+		count = p->swap_map[offset + i];
+		has_cache = count & SWAP_HAS_CACHE;
+		count &= ~SWAP_HAS_CACHE;
+
+		if (usage == SWAP_HAS_CACHE)
+			has_cache = SWAP_HAS_CACHE;
+		else if ((count & ~COUNT_CONTINUED) < SWAP_MAP_MAX)
 			count += usage;
-		else if ((count & ~COUNT_CONTINUED) > SWAP_MAP_MAX)
-			err = -EINVAL;
-		else if (swap_count_continued(p, offset, count))
-			count = COUNT_CONTINUED;
 		else
-			err = -ENOMEM;
-	} else
-		err = -ENOENT;			/* unused swap entry */
+			count = COUNT_CONTINUED;
 
-	if (!err)
-		WRITE_ONCE(p->swap_map[offset], count | has_cache);
+		WRITE_ONCE(p->swap_map[offset + i], count | has_cache);
+	}
 
 unlock_out:
 	unlock_cluster_or_swap_info(p, ci);
 	return err;
 }
 
+static int __swap_duplicate(swp_entry_t entry, unsigned char usage)
+{
+	return __swap_duplicate_nr(entry, usage, 1);
+}
+
 /*
  * Help swapoff by noting that swap entry belongs to shmem/tmpfs
  * (in which case its reference count is never incremented).
@@ -3485,22 +3508,17 @@ int swap_duplicate(swp_entry_t entry)
  * -EEXIST means there is a swap cache.
  * Note: return code is different from swap_duplicate().
  */
-int swapcache_prepare(swp_entry_t entry)
+int swapcache_prepare_nr(swp_entry_t entry, int nr)
 {
-	return __swap_duplicate(entry, SWAP_HAS_CACHE);
+	return __swap_duplicate_nr(entry, SWAP_HAS_CACHE, nr);
 }
 
-void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry)
+void swapcache_clear_nr(struct swap_info_struct *si, swp_entry_t entry, int nr)
 {
-	struct swap_cluster_info *ci;
-	unsigned long offset = swp_offset(entry);
-	unsigned char usage;
+	pgoff_t offset = swp_offset(entry);
 
-	ci = lock_cluster_or_swap_info(si, offset);
-	usage = __swap_entry_free_locked(si, offset, SWAP_HAS_CACHE);
-	unlock_cluster_or_swap_info(si, ci);
-	if (!usage)
-		free_swap_slot(entry);
+	VM_WARN_ON(nr > SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER);
+	cluster_swap_free_nr(si, offset, nr, SWAP_HAS_CACHE);
 }
 
 struct swap_info_struct *swp_swap_info(swp_entry_t entry)
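
Note for readers: the SWAP_HAS_CACHE path of __swap_duplicate_nr() above follows a check-then-commit pattern. It validates all nr slots first and only then sets the flag, so a failure on any slot leaves the whole range untouched. The userspace sketch below illustrates that idea only; it is not the kernel code. Every demo_/DEMO_ name is hypothetical, the flag value and cluster size are assumptions chosen for the example, locking is omitted, and the sketch returns -EINVAL for a range that crosses a cluster boundary, whereas the patch only warns via VM_WARN_ON().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel definitions. */
#define DEMO_HAS_CACHE	0x40u	/* plays the role of SWAP_HAS_CACHE */
#define DEMO_CLUSTER	8u	/* plays the role of SWAPFILE_CLUSTER */

/*
 * Set the cache flag on nr consecutive slots starting at offset.
 * Pass 1 only reads; pass 2 only writes, and runs only if every slot
 * passed validation, so an error leaves map[] unmodified.
 */
static int demo_prepare_nr(unsigned char *map, size_t offset, int nr)
{
	int i;

	assert(map != NULL);

	/* Demo choice: reject ranges that cross a cluster boundary. */
	if (nr <= 0 || (size_t)nr > DEMO_CLUSTER - offset % DEMO_CLUSTER)
		return -EINVAL;

	for (i = 0; i < nr; i++) {
		unsigned char v = map[offset + i];

		if (v & DEMO_HAS_CACHE)		/* someone else added cache */
			return -EEXIST;
		if (!(v & ~DEMO_HAS_CACHE))	/* no users remaining */
			return -ENOENT;
	}

	for (i = 0; i < nr; i++)
		map[offset + i] |= DEMO_HAS_CACHE;

	return 0;
}

/* Undo demo_prepare_nr() for the same range. */
static void demo_clear_nr(unsigned char *map, size_t offset, int nr)
{
	int i;

	for (i = 0; i < nr; i++)
		map[offset + i] &= (unsigned char)~DEMO_HAS_CACHE;
}
```

A caller would pair the two helpers the way the patch pairs swapcache_prepare_nr() and swapcache_clear_nr(): prepare the whole range, do the swap-in work, then clear the same range. If prepare fails, nothing needs to be undone because no slot was modified.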