From patchwork Tue Jun 18 23:26:41 2024
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13703157
From: Ryan Roberts
To: Andrew Morton, Chris Li, Kairui Song, "Huang, Ying", Kalesh Singh,
 Barry Song, Hugh Dickins, David Hildenbrand
Cc: Ryan Roberts, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [RFC PATCH v1 1/5] mm: swap: Simplify end-of-cluster calculation
Date: Wed, 19 Jun 2024 00:26:41 +0100
Message-ID: <20240618232648.4090299-2-ryan.roberts@arm.com>
In-Reply-To: <20240618232648.4090299-1-ryan.roberts@arm.com>
References: <20240618232648.4090299-1-ryan.roberts@arm.com>

It's possible that a swap file will have a partial cluster at the end, if
the swap size is not a multiple of the cluster size. But this partial
cluster will never be marked free, so scan_swap_map_try_ssd_cluster() will
never see it. Therefore it can always consider that a cluster ends at the
next cluster boundary. This simplifies the endpoint calculation and removes
an unnecessary conditional.

This change has the useful side effect of making lock_cluster()
unconditional, which will be used in a later commit.
Signed-off-by: Ryan Roberts
---
 mm/swapfile.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index b3e5e384e330..30e79739dfdc 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -677,16 +677,14 @@ static bool scan_swap_map_try_ssd_cluster(struct swap_info_struct *si,
 	 * check if there is still free entry in the cluster, maintaining
 	 * natural alignment.
 	 */
-	max = min_t(unsigned long, si->max, ALIGN(tmp + 1, SWAPFILE_CLUSTER));
-	if (tmp < max) {
-		ci = lock_cluster(si, tmp);
-		while (tmp < max) {
-			if (swap_range_empty(si->swap_map, tmp, nr_pages))
-				break;
-			tmp += nr_pages;
-		}
-		unlock_cluster(ci);
+	max = ALIGN(tmp + 1, SWAPFILE_CLUSTER);
+	ci = lock_cluster(si, tmp);
+	while (tmp < max) {
+		if (swap_range_empty(si->swap_map, tmp, nr_pages))
+			break;
+		tmp += nr_pages;
 	}
+	unlock_cluster(ci);
 	if (tmp >= max) {
 		cluster->next[order] = SWAP_NEXT_INVALID;
 		goto new_cluster;

--
2.43.0

From patchwork Tue Jun 18 23:26:42 2024
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13703158
From: Ryan Roberts
To: Andrew Morton, Chris Li, Kairui Song, "Huang, Ying", Kalesh Singh,
 Barry Song, Hugh Dickins, David Hildenbrand
Cc: Ryan Roberts, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [RFC PATCH v1 2/5] mm: swap: Change SWAP_NEXT_INVALID to highest value
Date: Wed, 19 Jun 2024 00:26:42 +0100
Message-ID: <20240618232648.4090299-3-ryan.roberts@arm.com>
In-Reply-To: <20240618232648.4090299-1-ryan.roberts@arm.com>
References: <20240618232648.4090299-1-ryan.roberts@arm.com>
We are about to introduce a scanning mechanism that can present 0 as a
valid cluster offset to scan_swap_map_try_ssd_cluster(), so let's change
SWAP_NEXT_INVALID to UINT_MAX, which is always invalid as an offset in
practice.

Signed-off-by: Ryan Roberts
---
 include/linux/swap.h | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index bd450023b9a4..66566251ba31 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -261,12 +261,12 @@ struct swap_cluster_info {
 #define CLUSTER_FLAG_NEXT_NULL 2 /* This cluster has no next cluster */

 /*
- * The first page in the swap file is the swap header, which is always marked
- * bad to prevent it from being allocated as an entry. This also prevents the
- * cluster to which it belongs being marked free. Therefore 0 is safe to use as
- * a sentinel to indicate next is not valid in percpu_cluster.
+ * swap_info_struct::max is an unsigned int, so the maximum number of pages in
+ * the swap file is UINT_MAX. Therefore the highest legitimate index is
+ * UINT_MAX-1. Therefore UINT_MAX is safe to use as a sentinel to indicate next
+ * is not valid in percpu_cluster.
  */
-#define SWAP_NEXT_INVALID	0
+#define SWAP_NEXT_INVALID	UINT_MAX

 #ifdef CONFIG_THP_SWAP
 #define SWAP_NR_ORDERS		(PMD_ORDER + 1)

--
2.43.0

From patchwork Tue Jun 18 23:26:43 2024
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13703159
From: Ryan Roberts
To: Andrew Morton, Chris Li, Kairui Song, "Huang, Ying", Kalesh Singh,
 Barry Song, Hugh Dickins, David Hildenbrand
Cc: Ryan Roberts, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [RFC PATCH v1 3/5] mm: swap: Track allocation order for clusters
Date: Wed, 19 Jun 2024 00:26:43 +0100
Message-ID: <20240618232648.4090299-4-ryan.roberts@arm.com>
In-Reply-To: <20240618232648.4090299-1-ryan.roberts@arm.com>
References: <20240618232648.4090299-1-ryan.roberts@arm.com>
Add an `order` field to `struct swap_cluster_info`, which applies to
allocated clusters (i.e. those not on the free list) and tracks the swap
entry order that the cluster should be used to allocate. A future commit
will use this information to scan partially filled clusters to find
appropriate free swap entries for allocation.

Note that it is still possible for order-0 swap entries to be allocated in
clusters that indicate a higher order, due to the order-0 scanning
mechanism.

The maximum order we ever expect to see is 13 (PMD-size on arm64 with 64K
base pages). 13 fits into 4 bits, so let's steal 4 unused flags bits for
this purpose to avoid making `struct swap_cluster_info` any bigger.

Signed-off-by: Ryan Roberts
---
 include/linux/swap.h |  3 ++-
 mm/swapfile.c        | 24 +++++++++++++++---------
 2 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 66566251ba31..2a40fe02d281 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -255,7 +255,8 @@ struct swap_cluster_info {
 	 * cluster
 	 */
 	unsigned int data:24;
-	unsigned int flags:8;
+	unsigned int flags:4;
+	unsigned int order:4;
 };
 #define CLUSTER_FLAG_FREE 1 /* This cluster is free */
 #define CLUSTER_FLAG_NEXT_NULL 2 /* This cluster has no next cluster */
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 30e79739dfdc..7b13f02a7ac2 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -307,11 +307,13 @@ static inline void cluster_set_count(struct swap_cluster_info *info,
 	info->data = c;
 }

-static inline void cluster_set_count_flag(struct swap_cluster_info *info,
-					  unsigned int c, unsigned int f)
+static inline void cluster_set_count_flag_order(struct swap_cluster_info *info,
+						unsigned int c, unsigned int f,
+						unsigned int o)
 {
 	info->flags = f;
 	info->data = c;
+	info->order = o;
 }

 static inline unsigned int cluster_next(struct swap_cluster_info *info)
@@ -330,6 +332,7 @@ static inline void cluster_set_next_flag(struct swap_cluster_info *info,
 {
 	info->flags = f;
 	info->data = n;
+	info->order = 0;
 }

 static inline bool cluster_is_free(struct swap_cluster_info *info)
@@ -346,6 +349,7 @@ static inline void cluster_set_null(struct swap_cluster_info *info)
 {
 	info->flags = CLUSTER_FLAG_NEXT_NULL;
 	info->data = 0;
+	info->order = 0;
 }

 static inline struct swap_cluster_info *lock_cluster(struct swap_info_struct *si,
@@ -521,13 +525,14 @@ static void swap_users_ref_free(struct percpu_ref *ref)
 	complete(&si->comp);
 }

-static void alloc_cluster(struct swap_info_struct *si, unsigned long idx)
+static void alloc_cluster(struct swap_info_struct *si, unsigned long idx,
+			  int order)
 {
 	struct swap_cluster_info *ci = si->cluster_info;

 	VM_BUG_ON(cluster_list_first(&si->free_clusters) != idx);
 	cluster_list_del_first(&si->free_clusters, ci);
-	cluster_set_count_flag(ci + idx, 0, 0);
+	cluster_set_count_flag_order(ci + idx, 0, 0, order);
 }

 static void free_cluster(struct swap_info_struct *si, unsigned long idx)
@@ -556,14 +561,15 @@ static void free_cluster(struct swap_info_struct *si, unsigned long idx)
  */
 static void add_cluster_info_page(struct swap_info_struct *p,
 	struct swap_cluster_info *cluster_info, unsigned long page_nr,
-	unsigned long count)
+	int order)
 {
 	unsigned long idx = page_nr / SWAPFILE_CLUSTER;
+	unsigned long count = 1 << order;

 	if (!cluster_info)
 		return;
 	if (cluster_is_free(&cluster_info[idx]))
-		alloc_cluster(p, idx);
+		alloc_cluster(p, idx, order);

 	VM_BUG_ON(cluster_count(&cluster_info[idx]) + count > SWAPFILE_CLUSTER);
 	cluster_set_count(&cluster_info[idx],
@@ -577,7 +583,7 @@ static void add_cluster_info_page(struct swap_info_struct *p,
 static void inc_cluster_info_page(struct swap_info_struct *p,
 	struct swap_cluster_info *cluster_info, unsigned long page_nr)
 {
-	add_cluster_info_page(p, cluster_info, page_nr, 1);
+	add_cluster_info_page(p, cluster_info, page_nr, 0);
 }

 /*
@@ -964,7 +970,7 @@ static int scan_swap_map_slots(struct swap_info_struct *si,
 		goto done;
 	}
 	memset(si->swap_map + offset, usage, nr_pages);
-	add_cluster_info_page(si, si->cluster_info, offset, nr_pages);
+	add_cluster_info_page(si, si->cluster_info, offset, order);
 	unlock_cluster(ci);

 	swap_range_alloc(si, offset, nr_pages);
@@ -1060,7 +1066,7 @@ static void swap_free_cluster(struct swap_info_struct *si, unsigned long idx)

 	ci = lock_cluster(si, offset);
 	memset(si->swap_map + offset, 0, SWAPFILE_CLUSTER);
-	cluster_set_count_flag(ci, 0, 0);
+	cluster_set_count_flag_order(ci, 0, 0, 0);
 	free_cluster(si, idx);
 	unlock_cluster(ci);
 	swap_range_free(si, offset, SWAPFILE_CLUSTER);

--
2.43.0

From patchwork Tue Jun 18 23:26:44 2024
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13703160
From: Ryan Roberts
To: Andrew Morton, Chris Li, Kairui Song, "Huang, Ying", Kalesh Singh,
 Barry Song, Hugh Dickins, David Hildenbrand
Cc: Ryan Roberts, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [RFC PATCH v1 4/5] mm: swap: Scan for free swap entries in allocated clusters
Date: Wed, 19 Jun 2024 00:26:44 +0100
Message-ID: <20240618232648.4090299-5-ryan.roberts@arm.com>
In-Reply-To: <20240618232648.4090299-1-ryan.roberts@arm.com>
References: <20240618232648.4090299-1-ryan.roberts@arm.com>
Previously, mTHP would only be swapped out if a CPU could allocate itself a
free cluster from which to allocate mTHP-sized contiguous blocks of swap
entries. But on a system making heavy use of swap, fragmentation eventually
ensures there are no free clusters available, so the swap entry allocation
fails and forces the mTHP to be split to base pages, which then get swap
entries allocated individually by scanning the swap file for free pages.
But when swap entries are freed, this makes holes in the clusters, and often it would be possible to allocate new mTHP swap entries in those holes. So if we fail to allocate a free cluster, scan through the clusters until we find one that is in use and contains swap entries of the order we require. Then scan it until we find a suitably sized and aligned hole. We keep a per-order "next cluster to scan" pointer so that future scanning can be picked up from where we last left off. And if we scan through all clusters without finding a suitable hole, we give up to prevent live lock. Running the test case provided by Barry Song at the below link, I can see swpout fallback rate, which was previously 100% after a few iterations, falls to 0% and stays there for all 100 iterations. This is also the case when sprinkling in some non-mTHP allocations ("-s") too. Signed-off-by: Ryan Roberts Link: https://lore.kernel.org/linux-mm/20240615084714.37499-1-21cnbao@gmail.com/ --- include/linux/swap.h | 2 + mm/swapfile.c | 90 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 92 insertions(+) -- 2.43.0 diff --git a/include/linux/swap.h b/include/linux/swap.h index 2a40fe02d281..34ec4668a5c9 100644 --- a/include/linux/swap.h +++ b/include/linux/swap.h @@ -310,6 +310,8 @@ struct swap_info_struct { unsigned int cluster_nr; /* countdown to next cluster search */ unsigned int __percpu *cluster_next_cpu; /*percpu index for next allocation */ struct percpu_cluster __percpu *percpu_cluster; /* per cpu's swap location */ + struct swap_cluster_info *next_order_scan[SWAP_NR_ORDERS]; + /* Start cluster for next order-based scan */ struct rb_root swap_extent_root;/* root of the swap extent rbtree */ struct block_device *bdev; /* swap device or bdev of swap file */ struct file *swap_file; /* seldom referenced */ diff --git a/mm/swapfile.c b/mm/swapfile.c index 7b13f02a7ac2..24db03db8830 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -644,6 +644,84 @@ static inline bool 
swap_range_empty(char *swap_map, unsigned int start, return true; } +static inline +struct swap_cluster_info *offset_to_cluster(struct swap_info_struct *si, + unsigned int offset) +{ + VM_WARN_ON(!si->cluster_info); + return si->cluster_info + (offset / SWAPFILE_CLUSTER); +} + +static inline +unsigned int cluster_to_offset(struct swap_info_struct *si, + struct swap_cluster_info *ci) +{ + VM_WARN_ON(!si->cluster_info); + return (ci - si->cluster_info) * SWAPFILE_CLUSTER; +} + +static inline +struct swap_cluster_info *next_cluster_circular(struct swap_info_struct *si, + struct swap_cluster_info *ci) +{ + struct swap_cluster_info *last; + + /* + * Wrap after the last whole cluster; never return the final partial + * cluster because users assume an entire cluster is accessible. + */ + last = offset_to_cluster(si, si->max) - 1; + return ci == last ? si->cluster_info : ++ci; +} + +static inline +struct swap_cluster_info *prev_cluster_circular(struct swap_info_struct *si, + struct swap_cluster_info *ci) +{ + struct swap_cluster_info *last; + + /* + * Wrap to the last whole cluster; never return the final partial + * cluster because users assume an entire cluster is accessible. + */ + last = offset_to_cluster(si, si->max) - 1; + return ci == si->cluster_info ? last : --ci; +} + +/* + * Returns the offset of the next cluster, allocated to contain swap entries of + * `order`, that is eligible to scan for free space. On first call, *stop should + * be set to SWAP_NEXT_INVALID to indicate the clusters should be scanned all + * the way back around to the returned cluster. The function updates *stop upon + * first call and consumes it in subsequent calls. Returns SWAP_NEXT_INVALID if + * no such clusters are available. Must be called with si lock held. 
+ */
+static unsigned int next_cluster_for_scan(struct swap_info_struct *si,
+					  int order, unsigned int *stop)
+{
+	struct swap_cluster_info *ci;
+	struct swap_cluster_info *end;
+
+	ci = si->next_order_scan[order];
+	if (*stop == SWAP_NEXT_INVALID)
+		*stop = cluster_to_offset(si, prev_cluster_circular(si, ci));
+	end = offset_to_cluster(si, *stop);
+
+	while (ci != end) {
+		if ((ci->flags & CLUSTER_FLAG_FREE) == 0 && ci->order == order)
+			break;
+		ci = next_cluster_circular(si, ci);
+	}
+
+	if (ci == end) {
+		si->next_order_scan[order] = ci;
+		return SWAP_NEXT_INVALID;
+	}
+
+	si->next_order_scan[order] = next_cluster_circular(si, ci);
+	return cluster_to_offset(si, ci);
+}
+
 /*
  * Try to get swap entries with specified order from current cpu's swap entry
  * pool (a cluster). This might involve allocating a new cluster for current CPU
@@ -656,6 +734,7 @@ static bool scan_swap_map_try_ssd_cluster(struct swap_info_struct *si,
 	struct percpu_cluster *cluster;
 	struct swap_cluster_info *ci;
 	unsigned int tmp, max;
+	unsigned int stop = SWAP_NEXT_INVALID;
 
 new_cluster:
 	cluster = this_cpu_ptr(si->percpu_cluster);
@@ -674,6 +753,15 @@ static bool scan_swap_map_try_ssd_cluster(struct swap_info_struct *si,
 			*scan_base = this_cpu_read(*si->cluster_next_cpu);
 			*offset = *scan_base;
 			goto new_cluster;
+		} else if (nr_pages < SWAPFILE_CLUSTER) {
+			/*
+			 * There is no point in scanning for free areas the same
+			 * size as the cluster, since the cluster would have
+			 * already been freed in that case.
+			 */
+			tmp = next_cluster_for_scan(si, order, &stop);
+			if (tmp == SWAP_NEXT_INVALID)
+				return false;
 		} else
 			return false;
 	}
@@ -2392,6 +2480,8 @@ static void setup_swap_info(struct swap_info_struct *p, int prio,
 	}
 	p->swap_map = swap_map;
 	p->cluster_info = cluster_info;
+	for (i = 0; i < SWAP_NR_ORDERS; i++)
+		p->next_order_scan[i] = cluster_info;
 }
 
 static void _enable_swap_info(struct swap_info_struct *p)

From patchwork Tue Jun 18 23:26:45 2024
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13703161
From: Ryan Roberts
To: Andrew Morton, Chris Li, Kairui Song, "Huang, Ying", Kalesh Singh, Barry Song, Hugh Dickins, David Hildenbrand
Cc: Ryan Roberts, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [RFC PATCH v1 5/5] mm: swap: Optimize per-order cluster scanning
Date: Wed, 19 Jun 2024 00:26:45 +0100
Message-ID: <20240618232648.4090299-6-ryan.roberts@arm.com>
In-Reply-To: <20240618232648.4090299-1-ryan.roberts@arm.com>
References: <20240618232648.4090299-1-ryan.roberts@arm.com>

Add a CLUSTER_FLAG_SKIP_SCAN cluster flag; when present, the cluster is
skipped during a per-order scan. The flag is applied to a cluster under
one of two conditions:

- When the number of free entries is less than the number of entries
  that would be required for a new allocation of the order that the
  cluster serves.
- When scanning completes for the cluster, no further scanners are
  active for the cluster, and no swap entries have been freed from the
  cluster since the last scan began. In this case it has been proven
  that the cluster contains no contiguous run of free entries large
  enough to allocate the order that the cluster serves. The cluster
  becomes eligible for scanning again when the next entry is freed from
  it.

The latter is implemented to permit multiple CPUs to scan the same
cluster, which in turn guarantees that if there is a free block
available in a cluster allocated for the desired order, it will be
allocated on a first come, first served basis. As a result, the number
of active scanners per cluster must be tracked, costing 4 bytes per
cluster.

Signed-off-by: Ryan Roberts
---
 include/linux/swap.h |  3 +++
 mm/swapfile.c        | 36 ++++++++++++++++++++++++++++++++++--
 2 files changed, 37 insertions(+), 2 deletions(-)

--
2.43.0

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 34ec4668a5c9..40c308749e79 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -257,9 +257,12 @@ struct swap_cluster_info {
 	unsigned int data:24;
 	unsigned int flags:4;
 	unsigned int order:4;
+	unsigned int nr_scanners;
 };
 #define CLUSTER_FLAG_FREE 1 /* This cluster is free */
 #define CLUSTER_FLAG_NEXT_NULL 2 /* This cluster has no next cluster */
+#define CLUSTER_FLAG_SKIP_SCAN 4 /* Skip cluster for per-order scan */
+#define CLUSTER_FLAG_DECREMENT 8 /* A swap entry was freed from cluster */
 
 /*
  * swap_info_struct::max is an unsigned int, so the maximum number of pages in
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 24db03db8830..caf382b4ecd3 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -574,6 +574,9 @@ static void add_cluster_info_page(struct swap_info_struct *p,
 	VM_BUG_ON(cluster_count(&cluster_info[idx]) + count > SWAPFILE_CLUSTER);
 	cluster_set_count(&cluster_info[idx],
 			  cluster_count(&cluster_info[idx]) + count);
+
+	if (SWAPFILE_CLUSTER - cluster_count(&cluster_info[idx]) < count)
+		cluster_info[idx].flags |= CLUSTER_FLAG_SKIP_SCAN;
 }
 
 /*
@@ -595,6 +598,7 @@ static void dec_cluster_info_page(struct swap_info_struct *p,
 	struct swap_cluster_info *cluster_info, unsigned long page_nr)
 {
 	unsigned long idx = page_nr / SWAPFILE_CLUSTER;
+	unsigned long count = 1 << cluster_info[idx].order;
 
 	if (!cluster_info)
 		return;
@@ -603,6 +607,10 @@ static void dec_cluster_info_page(struct swap_info_struct *p,
 	cluster_set_count(&cluster_info[idx],
 		cluster_count(&cluster_info[idx]) - 1);
 
+	cluster_info[idx].flags |= CLUSTER_FLAG_DECREMENT;
+	if (SWAPFILE_CLUSTER - cluster_count(&cluster_info[idx]) >= count)
+		cluster_info[idx].flags &= ~CLUSTER_FLAG_SKIP_SCAN;
+
 	if (cluster_count(&cluster_info[idx]) == 0)
 		free_cluster(p, idx);
 }
@@ -708,7 +716,8 @@ static unsigned int next_cluster_for_scan(struct swap_info_struct *si,
 	end = offset_to_cluster(si, *stop);
 
 	while (ci != end) {
-		if ((ci->flags & CLUSTER_FLAG_FREE) == 0 && ci->order == order)
+		if ((ci->flags & (CLUSTER_FLAG_SKIP_SCAN | CLUSTER_FLAG_FREE)) == 0
+		    && ci->order == order)
 			break;
 		ci = next_cluster_circular(si, ci);
 	}
@@ -722,6 +731,21 @@ static unsigned int next_cluster_for_scan(struct swap_info_struct *si,
 	return cluster_to_offset(si, ci);
 }
 
+static inline void cluster_inc_scanners(struct swap_cluster_info *ci)
+{
+	/* Protected by si lock. */
+	ci->nr_scanners++;
+	ci->flags &= ~CLUSTER_FLAG_DECREMENT;
+}
+
+static inline void cluster_dec_scanners(struct swap_cluster_info *ci)
+{
+	/* Protected by si lock. */
+	ci->nr_scanners--;
+	if (ci->nr_scanners == 0 && (ci->flags & CLUSTER_FLAG_DECREMENT) == 0)
+		ci->flags |= CLUSTER_FLAG_SKIP_SCAN;
+}
+
 /*
  * Try to get swap entries with specified order from current cpu's swap entry
  * pool (a cluster). This might involve allocating a new cluster for current CPU
@@ -764,6 +788,8 @@ static bool scan_swap_map_try_ssd_cluster(struct swap_info_struct *si,
 				return false;
 		} else
 			return false;
+
+		cluster_inc_scanners(offset_to_cluster(si, tmp));
 	}
 
 	/*
@@ -780,13 +806,19 @@ static bool scan_swap_map_try_ssd_cluster(struct swap_info_struct *si,
 	}
 	unlock_cluster(ci);
 	if (tmp >= max) {
+		cluster_dec_scanners(ci);
 		cluster->next[order] = SWAP_NEXT_INVALID;
 		goto new_cluster;
 	}
 	*offset = tmp;
 	*scan_base = tmp;
 	tmp += nr_pages;
-	cluster->next[order] = tmp < max ? tmp : SWAP_NEXT_INVALID;
+	if (tmp >= max) {
+		cluster_dec_scanners(ci);
+		cluster->next[order] = SWAP_NEXT_INVALID;
+	} else {
+		cluster->next[order] = tmp;
+	}
 	return true;
 }