[0/7] mm, swap: remove swap slot cache


Message ID

20250214175709.76029-1-ryncsn@gmail.com (mailing list archive)

Headers

From: Kairui Song <ryncsn@gmail.com>
To: linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Chris Li <chrisl@kernel.org>,
	Barry Song <v-songbaohua@oppo.com>,
	Hugh Dickins <hughd@google.com>,
	Yosry Ahmed <yosryahmed@google.com>,
	"Huang, Ying" <ying.huang@linux.alibaba.com>,
	Baoquan He <bhe@redhat.com>,
	Nhat Pham <nphamcs@gmail.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Kalesh Singh <kaleshsingh@google.com>,
	linux-kernel@vger.kernel.org,
	Kairui Song <kasong@tencent.com>
Subject: [PATCH 0/7] mm, swap: remove swap slot cache
Date: Sat, 15 Feb 2025 01:57:02 +0800
Message-ID: <20250214175709.76029-1-ryncsn@gmail.com>
Reply-To: Kairui Song <kasong@tencent.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Sender: owner-linux-mm@kvack.org
Precedence: bulk

Series

mm, swap: remove swap slot cache

Message

Kairui Song Feb. 14, 2025, 5:57 p.m. UTC

From: Kairui Song <kasong@tencent.com>

Slot cache was initially introduced by commit 67afa38e012e ("mm/swap:
add cache for swap slots allocation") to reduce the lock contention
of si->lock.

Previous series "mm, swap: rework of swap allocator locks" [1] removed
the swap slot cache from the freeing path, as the freeing path no longer
touches si->lock in most cases. The allocation path also has little to
no contention on si->lock since that series, but the slot cache still
helps to reduce other overheads, like counters and the plist.

This series removes the slot cache from the allocation path too, by
using the percpu cluster as the allocation fast path and also reducing
other overheads.
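
As a rough illustration of the idea only, the following userspace toy
models a per-CPU cluster used as an allocation fast path. It is not the
kernel code: every name here (toy_alloc_slot, toy_claim_cluster,
struct toy_percpu_cluster, the TOY_* sizes) is hypothetical, and the
model has no locking, no freeing and no real per-CPU data. It only
shows that a CPU keeps taking slots from its current cluster and falls
back to a slower, shared lookup when that cluster runs out.

```c
#include <assert.h>
#include <stdbool.h>

/* All sizes and names below are made up for this sketch. */
#define TOY_NR_CPUS      2
#define TOY_NR_CLUSTERS  4
#define TOY_CLUSTER_SIZE 4
#define TOY_NO_CLUSTER   (-1)

/* true when the slot has been handed out */
static bool toy_slot_used[TOY_NR_CLUSTERS * TOY_CLUSTER_SIZE];
/* true once a cluster has been given to some CPU */
static bool toy_cluster_claimed[TOY_NR_CLUSTERS];

/* Hypothetical per-CPU state: the cluster this CPU allocates from. */
struct toy_percpu_cluster {
	int cluster;	/* current cluster index, or TOY_NO_CLUSTER */
	int next;	/* next offset to hand out inside that cluster */
};

static struct toy_percpu_cluster toy_pcp[TOY_NR_CPUS] = {
	{ TOY_NO_CLUSTER, 0 },
	{ TOY_NO_CLUSTER, 0 },
};

/*
 * Slow path: pick the first unclaimed cluster. A real allocator would
 * need to serialize this against other CPUs; the toy does not.
 */
static int toy_claim_cluster(void)
{
	for (int c = 0; c < TOY_NR_CLUSTERS; c++) {
		if (!toy_cluster_claimed[c]) {
			toy_cluster_claimed[c] = true;
			return c;
		}
	}
	return TOY_NO_CLUSTER;
}

/* Returns a slot index, or -1 when no cluster is left to claim. */
int toy_alloc_slot(int cpu)
{
	struct toy_percpu_cluster *p;
	int slot;

	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	p = &toy_pcp[cpu];

	/* No usable cached cluster: take the slow path once. */
	if (p->cluster == TOY_NO_CLUSTER || p->next == TOY_CLUSTER_SIZE) {
		p->cluster = toy_claim_cluster();
		p->next = 0;
		if (p->cluster == TOY_NO_CLUSTER)
			return -1;
	}

	/* Fast path: only this CPU's cached cluster state is touched. */
	slot = p->cluster * TOY_CLUSTER_SIZE + p->next++;
	toy_slot_used[slot] = true;
	return slot;
}
```

In this toy, five calls to toy_alloc_slot(0) return slots 0 to 4 (the
fifth call goes through the slow path to claim a second cluster), and a
following toy_alloc_slot(1) returns 8, the first slot of the next
unclaimed cluster.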

Now that the slot cache is completely gone, the code is much simpler,
with no obvious feature or performance change, and the related
workarounds are cleaned up. This should also avoid other potential
issues, e.g. the long pinning of swap slots: the swap slot cache pins
swap slots with HAS_CACHE, so reclaim or allocation fails to use these
slots when scanning.

The only behavior change is the swap device allocation rotation
mechanism, as explained in the patch "mm, swap: use percpu cluster
as allocation fast path".

Test results are looking good after deleting the swap slot cache:

- vm-scalability with: `usemem --init-time -O -y -x -R -31 1G`,
12G memory cgroup using simulated pmem as SWAP (32G pmem, 32 CPUs),
16 test runs for each case, measuring the total throughput:

                      Before (KB/s) (stdev)  After (KB/s) (stdev)
Random (4K):          424907.60 (24410.78)   414745.92  (34554.78)
Random (64K):         163308.82 (11635.72)   167314.50  (18434.99)
Sequential (4K, !-R): 6150056.79 (103205.90) 6321469.06 (115878.16)

- Build the Linux kernel with make -j96, using 4K folios with a 1.5G
memory cgroup limit and 64K folios with a 2G memory cgroup limit, on top
of tmpfs, 12 test runs, measuring the system time:

                  Before (s) (stdev)  After (s) (stdev)
make -j96 (4K):   6445.69 (61.95)     6408.80 (69.46)
make -j96 (64K):  6841.71 (409.04)    6437.99 (435.55)

Performance is essentially unchanged, and slightly better in some cases.
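
To put the tables above in relative terms, a small helper along these
lines can be used. It is only a sketch: the function names are made up
for illustration, and comparing the delta against a single standard
deviation is a crude noise check, not a statistical test.

```c
#include <assert.h>
#include <stdbool.h>

/* Relative change of `after` versus `before`, in percent. */
double relative_change_percent(double before, double after)
{
	assert(before != 0.0);
	return (after - before) / before * 100.0;
}

/*
 * Crude noise check: true when the absolute difference between the two
 * means does not exceed the given standard deviation.
 */
bool delta_within_stdev(double before, double after, double stdev)
{
	double delta = after - before;

	if (delta < 0.0)
		delta = -delta;
	return delta <= stdev;
}
```

For the make -j96 (64K) row, relative_change_percent(6841.71, 6437.99)
comes out at about -5.9, i.e. roughly 5.9% less system time. For the
Random (4K) row it is about -2.4 (throughput, so lower is worse), but
that delta is smaller than the reported "Before" standard deviation of
24410.78, so delta_within_stdev() returns true for it.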

[1] https://lore.kernel.org/linux-mm/20250113175732.48099-1-ryncsn@gmail.com/

Kairui Song (7):
  mm, swap: avoid reclaiming irrelevant swap cache
  mm, swap: drop the flag TTRS_DIRECT
  mm, swap: avoid redundant swap device pinning
  mm, swap: don't update the counter up-front
  mm, swap: use percpu cluster as allocation fast path
  mm, swap: remove swap slot cache
  mm, swap: simplify folio swap allocation

 include/linux/swap.h       |  21 +--
 include/linux/swap_slots.h |  28 ----
 mm/Makefile                |   2 +-
 mm/shmem.c                 |  21 +--
 mm/swap.h                  |   6 -
 mm/swap_slots.c            | 295 ----------------------------------
 mm/swap_state.c            |  79 ++--------
 mm/swapfile.c              | 315 ++++++++++++++++++-------------------
 mm/vmscan.c                |  16 +-
 mm/zswap.c                 |   6 +
 10 files changed, 196 insertions(+), 593 deletions(-)
 delete mode 100644 include/linux/swap_slots.h
 delete mode 100644 mm/swap_slots.c

Comments

Baoquan He Feb. 15, 2025, 10:27 a.m. UTC | #1

Hi Kairui,

On 02/15/25 at 01:57am, Kairui Song wrote:
> From: Kairui Song <kasong@tencent.com>
> 
> Slot cache was initially introduced by commit 67afa38e012e ("mm/swap:
> add cache for swap slots allocation") to reduce the lock contention
> of si->lock.

Thanks for adding me in CC. I failed to apply this series to the
latest mainline kernel, though. Could you tell me what the base commit
of this patchset is?

Thanks
Baoquan

Kairui Song Feb. 15, 2025, 1:34 p.m. UTC | #2

On Sat, Feb 15, 2025 at 6:28 PM Baoquan He <bhe@redhat.com> wrote:
>
> Hi Kairui,
>
> On 02/15/25 at 01:57am, Kairui Song wrote:
> > From: Kairui Song <kasong@tencent.com>
> >
> > Slot cache was initially introduced by commit 67afa38e012e ("mm/swap:
> > add cache for swap slots allocation") to reduce the lock contention
> > of si->lock.
>
> Thanks for adding me in CC. I failed to apply this series to the
> latest mainline kernel, though. Could you tell me what the base commit
> of this patchset is?
>
> Thanks
> Baoquan
>

Hi Baoquan,

It's based on Andrew's mm-unstable here:
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git

I've just re-checked, and there should be no conflict. Sorry I didn't
include this info in the cover letter; mm development is rapid, so I
usually send patches based on mm-unstable.

Baoquan He Feb. 15, 2025, 3:07 p.m. UTC | #3

On 02/15/25 at 09:34pm, Kairui Song wrote:
> On Sat, Feb 15, 2025 at 6:28 PM Baoquan He <bhe@redhat.com> wrote:
> >
> > Hi Kairui,
> >
> > On 02/15/25 at 01:57am, Kairui Song wrote:
> > > From: Kairui Song <kasong@tencent.com>
> > >
> > > Slot cache was initially introduced by commit 67afa38e012e ("mm/swap:
> > > add cache for swap slots allocation") to reduce the lock contention
> > > of si->lock.
> >
> > Thanks for adding me in CC. I failed to apply this series to the
> > latest mainline kernel, though. Could you tell me what the base commit
> > of this patchset is?
> >
> > Thanks
> > Baoquan
> >
> 
> Hi Baoquan,
> 
> It's based on Andrew's mm-unstable here:
> git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git
> 
> I've just re-checked, and there should be no conflict. Sorry I didn't
> include this info in the cover letter; mm development is rapid, so I
> usually send patches based on mm-unstable.

Thanks, it applied cleanly to the akpm-mm/mm-unstable branch.