From patchwork Wed Apr 3 11:40:29 2024
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13616011
From: Ryan Roberts
To: Andrew Morton, David Hildenbrand, Matthew Wilcox, Huang Ying,
 Gao Xiang, Yu Zhao, Yang Shi, Michal Hocko, Kefeng Wang,
 Barry Song <21cnbao@gmail.com>, Chris Li, Lance Yang
Cc: Ryan Roberts, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v6 3/6] mm: swap: Simplify struct percpu_cluster
Date: Wed, 3 Apr 2024 12:40:29 +0100
Message-Id: <20240403114032.1162100-4-ryan.roberts@arm.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20240403114032.1162100-1-ryan.roberts@arm.com>
References: <20240403114032.1162100-1-ryan.roberts@arm.com>

struct percpu_cluster stores the index of the cpu's current cluster and
the offset of the next entry that will be allocated for the cpu. These
two pieces of information are redundant because the cluster index is
just (offset / SWAPFILE_CLUSTER). The only reason for explicitly keeping
the cluster index is that the structure used for it also has a flag to
indicate "no cluster". However, this data structure also contains a spin
lock, which is never used in this context. As a side effect, the code
copies the spinlock_t structure, which is questionable coding practice
in my view.
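
To illustrate the redundancy described above, here is a minimal
userspace sketch, not kernel code. The helper names offset_to_cluster()
and cluster_end() are hypothetical, and the value 256 for
SWAPFILE_CLUSTER is only assumed for illustration (the kernel's value
depends on configuration). It shows that both the cluster index and the
end of the current cluster can be derived from the offset alone:

```c
#include <assert.h>

/* Assumed value for illustration only; must be a power of two here. */
#define SWAPFILE_CLUSTER 256u

/* Hypothetical helper: index of the cluster that contains `offset`. */
static unsigned int offset_to_cluster(unsigned int offset)
{
	return offset / SWAPFILE_CLUSTER;
}

/*
 * Hypothetical helper: first offset past the cluster that contains
 * `offset`. This rounds (offset + 1) up to the next multiple of
 * SWAPFILE_CLUSTER, which is what ALIGN(offset + 1, SWAPFILE_CLUSTER)
 * computes for a power-of-two cluster size.
 */
static unsigned int cluster_end(unsigned int offset)
{
	/* The mask trick below is only valid for power-of-two sizes. */
	assert((SWAPFILE_CLUSTER & (SWAPFILE_CLUSTER - 1)) == 0);
	return (offset + 1 + SWAPFILE_CLUSTER - 1) & ~(SWAPFILE_CLUSTER - 1);
}
```

With these definitions, an offset of 3 * SWAPFILE_CLUSTER + 17 lies in
cluster 3, and that cluster ends at 4 * SWAPFILE_CLUSTER, so no separate
index field is needed to recover either value.
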

So let's clean this up and store only the next offset, and use a
sentinel value (SWAP_NEXT_INVALID) to indicate "no cluster".
SWAP_NEXT_INVALID is chosen to be 0, because 0 will never be seen
legitimately: the first page in the swap file is the swap header, which
is always marked bad to prevent it from being allocated as an entry.
This also prevents the cluster to which it belongs being marked free, so
it will never appear on the free list.

This change saves 16 bytes per cpu. And given we are shortly going to
extend this mechanism to be per-cpu-AND-per-order, we will end up saving
16 * 9 = 144 bytes per cpu, which adds up if you have 256 cpus in the
system.

Reviewed-by: "Huang, Ying"
Signed-off-by: Ryan Roberts
---
 include/linux/swap.h |  9 ++++++++-
 mm/swapfile.c        | 22 +++++++++++-----------
 2 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 5737236dc3ce..5e1e4f5bf0cb 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -260,13 +260,20 @@ struct swap_cluster_info {
 #define CLUSTER_FLAG_FREE 1 /* This cluster is free */
 #define CLUSTER_FLAG_NEXT_NULL 2 /* This cluster has no next cluster */
 
+/*
+ * The first page in the swap file is the swap header, which is always marked
+ * bad to prevent it from being allocated as an entry. This also prevents the
+ * cluster to which it belongs being marked free. Therefore 0 is safe to use as
+ * a sentinel to indicate next is not valid in percpu_cluster.
+ */
+#define SWAP_NEXT_INVALID 0
+
 /*
  * We assign a cluster to each CPU, so each CPU can allocate swap entry from
  * its own cluster and swapout sequentially. The purpose is to optimize swapout
  * throughput.
  */
 struct percpu_cluster {
-	struct swap_cluster_info index; /* Current cluster index */
 	unsigned int next; /* Likely next allocation offset */
 };
 
diff --git a/mm/swapfile.c b/mm/swapfile.c
index d059de6896c1..c95986b9cb9c 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -609,7 +609,7 @@ scan_swap_map_ssd_cluster_conflict(struct swap_info_struct *si,
 		return false;
 
 	percpu_cluster = this_cpu_ptr(si->percpu_cluster);
-	cluster_set_null(&percpu_cluster->index);
+	percpu_cluster->next = SWAP_NEXT_INVALID;
 	return true;
 }
 
@@ -622,14 +622,14 @@ static bool scan_swap_map_try_ssd_cluster(struct swap_info_struct *si,
 {
 	struct percpu_cluster *cluster;
 	struct swap_cluster_info *ci;
-	unsigned long tmp, max;
+	unsigned int tmp, max;
 
 new_cluster:
 	cluster = this_cpu_ptr(si->percpu_cluster);
-	if (cluster_is_null(&cluster->index)) {
+	tmp = cluster->next;
+	if (tmp == SWAP_NEXT_INVALID) {
 		if (!cluster_list_empty(&si->free_clusters)) {
-			cluster->index = si->free_clusters.head;
-			cluster->next = cluster_next(&cluster->index) *
+			tmp = cluster_next(&si->free_clusters.head) *
 					SWAPFILE_CLUSTER;
 		} else if (!cluster_list_empty(&si->discard_clusters)) {
 			/*
@@ -649,9 +649,7 @@ static bool scan_swap_map_try_ssd_cluster(struct swap_info_struct *si,
 	 * Other CPUs can use our cluster if they can't find a free cluster,
 	 * check if there is still free entry in the cluster
 	 */
-	tmp = cluster->next;
-	max = min_t(unsigned long, si->max,
-		    (cluster_next(&cluster->index) + 1) * SWAPFILE_CLUSTER);
+	max = min_t(unsigned long, si->max, ALIGN(tmp + 1, SWAPFILE_CLUSTER));
 	if (tmp < max) {
 		ci = lock_cluster(si, tmp);
 		while (tmp < max) {
@@ -662,12 +660,13 @@ static bool scan_swap_map_try_ssd_cluster(struct swap_info_struct *si,
 		unlock_cluster(ci);
 	}
 	if (tmp >= max) {
-		cluster_set_null(&cluster->index);
+		cluster->next = SWAP_NEXT_INVALID;
 		goto new_cluster;
 	}
-	cluster->next = tmp + 1;
 	*offset = tmp;
 	*scan_base = tmp;
+	tmp += 1;
+	cluster->next = tmp < max ? tmp : SWAP_NEXT_INVALID;
 	return true;
 }
 
@@ -3150,8 +3149,9 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
 		}
 		for_each_possible_cpu(cpu) {
 			struct percpu_cluster *cluster;
+
 			cluster = per_cpu_ptr(p->percpu_cluster, cpu);
-			cluster_set_null(&cluster->index);
+			cluster->next = SWAP_NEXT_INVALID;
 		}
 	} else {
 		atomic_inc(&nr_rotate_swap);
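
For readers who want to see the sentinel scheme in isolation, the
following is a simplified userspace model of the allocation flow changed
above. It is a sketch under stated assumptions, not the kernel
implementation: it models a single cpu, takes no locks, uses a tiny
illustrative cluster size, and replaces the free cluster list with a
hypothetical linear scan (toy_find_free_cluster()). All toy_* names and
the swap_map array layout are made up for this example.

```c
#include <stdbool.h>
#include <string.h>

/* Illustrative sizes, not the kernel's values. */
#define SWAPFILE_CLUSTER  4u
#define NR_CLUSTERS       3u
#define MAX_OFFSET        (SWAPFILE_CLUSTER * NR_CLUSTERS)
#define SWAP_NEXT_INVALID 0u

struct percpu_cluster {
	unsigned int next; /* next offset to try, or SWAP_NEXT_INVALID */
};

/* Toy swap map: nonzero means "in use". Offset 0 models the swap header. */
static unsigned char swap_map[MAX_OFFSET];

static void toy_swap_init(struct percpu_cluster *pc)
{
	memset(swap_map, 0, sizeof(swap_map));
	swap_map[0] = 1; /* header is marked bad, so cluster 0 is never free */
	pc->next = SWAP_NEXT_INVALID;
}

/* Hypothetical stand-in for the free cluster list: first fully free cluster. */
static bool toy_find_free_cluster(unsigned int *start)
{
	for (unsigned int c = 0; c < NR_CLUSTERS; c++) {
		bool is_free = true;

		for (unsigned int i = 0; i < SWAPFILE_CLUSTER; i++)
			if (swap_map[c * SWAPFILE_CLUSTER + i])
				is_free = false;
		if (is_free) {
			*start = c * SWAPFILE_CLUSTER;
			return true;
		}
	}
	return false;
}

/* Allocate one entry; returns false when no fully free cluster is left. */
static bool toy_alloc(struct percpu_cluster *pc, unsigned int *offset)
{
	unsigned int tmp, max;

	for (;;) {
		tmp = pc->next;
		if (tmp == SWAP_NEXT_INVALID && !toy_find_free_cluster(&tmp))
			return false;

		/* End of the cluster containing tmp, derived from tmp alone. */
		max = (tmp / SWAPFILE_CLUSTER + 1) * SWAPFILE_CLUSTER;
		while (tmp < max && swap_map[tmp])
			tmp++;
		if (tmp < max)
			break;
		pc->next = SWAP_NEXT_INVALID; /* cluster exhausted, pick another */
	}

	swap_map[tmp] = 1;
	*offset = tmp;
	tmp += 1;
	/* Same rule as the patch: never store an offset in the next cluster. */
	pc->next = tmp < max ? tmp : SWAP_NEXT_INVALID;
	return true;
}
```

In this model the first allocation skips cluster 0 because the header
entry keeps it from ever being fully free, so offset 0 is never handed
out and cannot be confused with the sentinel. Once the last entry of a
cluster has been allocated, next falls back to SWAP_NEXT_INVALID and the
following call picks a new cluster.
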