From patchwork Fri Sep 6 00:10:45 2024
X-Patchwork-Submitter: Barry Song <21cnbao@gmail.com>
X-Patchwork-Id: 13793110
From: Barry Song <21cnbao@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: hanchuanhua@oppo.com, baolin.wang@linux.alibaba.com, chrisl@kernel.org,
 david@redhat.com, hannes@cmpxchg.org, hch@infradead.org, hughd@google.com,
 kaleshsingh@google.com, kasong@tencent.com, linux-kernel@vger.kernel.org,
 mhocko@suse.com, minchan@kernel.org, nphamcs@gmail.com,
 ryan.roberts@arm.com, ryncsn@gmail.com, senozhatsky@chromium.org,
 shakeel.butt@linux.dev, shy828301@gmail.com, surenb@google.com,
 v-songbaohua@oppo.com, willy@infradead.org, xiang@kernel.org,
 ying.huang@intel.com, yosryahmed@google.com, Usama Arif
Subject: [PATCH v8 1/3] mm: Fix swap_read_folio_zeromap() for large folios
 with partial zeromap
Date: Fri, 6 Sep 2024 12:10:45 +1200
Message-Id: <20240906001047.1245-2-21cnbao@gmail.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-146)
In-Reply-To: <20240906001047.1245-1-21cnbao@gmail.com>
References: <20240906001047.1245-1-21cnbao@gmail.com>
MIME-Version: 1.0

From: Barry Song

There could be a corner case where the first entry is non-zeromap,
but a subsequent entry is zeromap. In this case, we should not
let swap_read_folio_zeromap() return false since we will still
read corrupted data.

Additionally, the iteration of test_bit() is unnecessary and
can be replaced with bitmap operations, which are more efficient.

We can adopt the style of swap_pte_batch() and folio_pte_batch() to
introduce swap_zeromap_batch() which seems to provide the greatest
flexibility for the caller. This approach allows the caller to either
check if the zeromap status of all entries is consistent or determine
the number of contiguous entries with the same status.

Since swap_read_folio() can't handle reading a large folio that's
partially zeromap and partially non-zeromap, we've moved the code
to mm/swap.h so that others, like those working on swap-in, can
access it.

Fixes: 0ca0c24e3211 ("mm: store zero pages to be swapped out in a bitmap")
Cc: Usama Arif
Cc: Yosry Ahmed
Signed-off-by: Barry Song
Signed-off-by: Barry Song
Reviewed-by: Yosry Ahmed
---
 mm/page_io.c | 32 +++++++-------------------------
 mm/swap.h    | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 40 insertions(+), 25 deletions(-)

diff --git a/mm/page_io.c b/mm/page_io.c
index 4bc77d1c6bfa..2dfe2273a1f1 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -226,26 +226,6 @@ static void swap_zeromap_folio_clear(struct folio *folio)
 	}
 }
 
-/*
- * Return the index of the first subpage which is not zero-filled
- * according to swap_info_struct->zeromap.
- * If all pages are zero-filled according to zeromap, it will return
- * folio_nr_pages(folio).
- */
-static unsigned int swap_zeromap_folio_test(struct folio *folio)
-{
-	struct swap_info_struct *sis = swp_swap_info(folio->swap);
-	swp_entry_t entry;
-	unsigned int i;
-
-	for (i = 0; i < folio_nr_pages(folio); i++) {
-		entry = page_swap_entry(folio_page(folio, i));
-		if (!test_bit(swp_offset(entry), sis->zeromap))
-			return i;
-	}
-	return i;
-}
-
 /*
  * We may have stale swap cache pages in memory: notice
  * them here and get rid of the unnecessary final write.
@@ -524,19 +504,21 @@ static void sio_read_complete(struct kiocb *iocb, long ret)
 
 static bool swap_read_folio_zeromap(struct folio *folio)
 {
-	unsigned int idx = swap_zeromap_folio_test(folio);
-
-	if (idx == 0)
-		return false;
+	int nr_pages = folio_nr_pages(folio);
+	bool is_zeromap;
+	int nr_zeromap = swap_zeromap_batch(folio->swap, nr_pages, &is_zeromap);
 
 	/*
 	 * Swapping in a large folio that is partially in the zeromap is not
 	 * currently handled. Return true without marking the folio uptodate so
 	 * that an IO error is emitted (e.g. do_swap_page() will sigbus).
 	 */
-	if (WARN_ON_ONCE(idx < folio_nr_pages(folio)))
+	if (WARN_ON_ONCE(nr_zeromap != nr_pages))
 		return true;
 
+	if (!is_zeromap)
+		return false;
+
 	folio_zero_range(folio, 0, folio_size(folio));
 	folio_mark_uptodate(folio);
 	return true;
diff --git a/mm/swap.h b/mm/swap.h
index f8711ff82f84..1cc56a02fb5f 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -80,6 +80,32 @@ static inline unsigned int folio_swap_flags(struct folio *folio)
 {
 	return swp_swap_info(folio->swap)->flags;
 }
+
+/*
+ * Return the count of contiguous swap entries that share the same
+ * zeromap status as the starting entry. If is_zeromap is not NULL,
+ * it will return the zeromap status of the starting entry.
+ */
+static inline int swap_zeromap_batch(swp_entry_t entry, int max_nr,
+		bool *is_zeromap)
+{
+	struct swap_info_struct *sis = swp_swap_info(entry);
+	unsigned long start = swp_offset(entry);
+	unsigned long end = start + max_nr;
+	bool start_entry_zeromap;
+
+	start_entry_zeromap = test_bit(start, sis->zeromap);
+	if (is_zeromap)
+		*is_zeromap = start_entry_zeromap;
+
+	if (max_nr <= 1)
+		return max_nr;
+	if (start_entry_zeromap)
+		return find_next_zero_bit(sis->zeromap, end, start) - start;
+	else
+		return find_next_bit(sis->zeromap, end, start) - start;
+}
+
 #else /* CONFIG_SWAP */
 struct swap_iocb;
 static inline void swap_read_folio(struct folio *folio, struct swap_iocb **plug)
@@ -171,6 +197,13 @@ static inline unsigned int folio_swap_flags(struct folio *folio)
 {
 	return 0;
 }
+
+static inline int swap_zeromap_batch(swp_entry_t entry, int max_nr,
+		bool *has_zeromap)
+{
+	return 0;
+}
+
 #endif /* CONFIG_SWAP */
 
 #endif /* _MM_SWAP_H */
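
Illustration only, not part of the patch: a minimal userspace sketch of
the batching semantics described in the commit message, assuming a
plain unsigned long array in place of sis->zeromap. The names
bit_is_set(), zeromap_batch(), classify_folio() and enum read_action
are hypothetical stand-ins, and a simple loop replaces the kernel's
find_next_bit()/find_next_zero_bit() helpers.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical userspace stand-in for the kernel's test_bit(). */
static bool bit_is_set(const unsigned long *map, unsigned long nr)
{
	return (map[nr / BITS_PER_WORD] >> (nr % BITS_PER_WORD)) & 1UL;
}

/*
 * Model of swap_zeromap_batch(): count how many entries in
 * [start, start + max_nr) share the zeromap status of the entry at
 * start. If is_zeromap is not NULL, report that starting status.
 */
static int zeromap_batch(const unsigned long *zeromap, unsigned long start,
			 int max_nr, bool *is_zeromap)
{
	bool first = bit_is_set(zeromap, start);
	int nr;

	if (is_zeromap)
		*is_zeromap = first;
	if (max_nr <= 1)
		return max_nr;

	/* Stop at the first entry whose status differs from the start. */
	for (nr = 1; nr < max_nr; nr++)
		if (bit_is_set(zeromap, start + nr) != first)
			break;
	return nr;
}

/* Hypothetical names for the three outcomes of the read path. */
enum read_action { READ_FROM_DEVICE, FILL_WITH_ZEROS, REJECT_MIXED };

/*
 * Model of the decision order in swap_read_folio_zeromap(): a mixed
 * folio is rejected first, so a non-zeromap first entry followed by a
 * zeromap entry no longer falls through to a normal read.
 */
static enum read_action classify_folio(const unsigned long *zeromap,
				       unsigned long start, int nr_pages)
{
	bool is_zeromap;
	int nr = zeromap_batch(zeromap, start, nr_pages, &is_zeromap);

	if (nr != nr_pages)
		return REJECT_MIXED;
	return is_zeromap ? FILL_WITH_ZEROS : READ_FROM_DEVICE;
}
```

With a bitmap where only bits 4..7 are set, a 4-entry range starting at
offset 2 begins with a non-zeromap entry and is classified as mixed.
This is the corner case the old idx == 0 check let through.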