[0/4] mm/filemap: optimize folio adding and splitting


Message ID

20240319092733.4501-1-ryncsn@gmail.com

Headers

From: Kairui Song <ryncsn@gmail.com>
To: linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Matthew Wilcox <willy@infradead.org>,
	linux-kernel@vger.kernel.org,
	Kairui Song <kasong@tencent.com>
Subject: [PATCH 0/4] mm/filemap: optimize folio adding and splitting
Date: Tue, 19 Mar 2024 17:27:29 +0800
Message-ID: <20240319092733.4501-1-ryncsn@gmail.com>
Reply-To: Kairui Song <kasong@tencent.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Sender: owner-linux-mm@kvack.org
Precedence: bulk

Message

Kairui Song March 19, 2024, 9:27 a.m. UTC

From: Kairui Song <kasong@tencent.com>

Currently, adding a folio to the filemap in place of a previously evicted
folio takes at least 3 tree walks: one to get the order, one for the ranged
conflict check, and one more to retrieve the order again. If a split is
needed, even more walks are required.

This series merges these walks to speed up filemap_add_folio.

Instead of doing multiple tree walks, do one optimistic range check
with the lock held, and exit if raced with another insertion. If a shadow
exists, check it with a new xas_get_order helper before releasing the
lock, to avoid redundant tree walks for getting its order.

Drop the lock and do the allocation only if a split is needed.

In the best case, it only needs to walk the tree once. If it needs
to allocate and split, 3 walks are issued (one for the first ranged
conflict check and order retrieval, one for the second check after
allocation, and one for the insert after the split).
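
The walk counts above can be illustrated with a small userspace model.
This is only a sketch under stated assumptions, not the code from this
series: toy_add_folio, struct toy_slot, and struct toy_result are
hypothetical names, and locking and allocation are reduced to comments.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the target index range; not a kernel structure. */
struct toy_slot {
	bool occupied;      /* a folio is already present (another insert won) */
	bool has_shadow;    /* a shadow entry from an earlier eviction exists */
	int  shadow_order;  /* order of that shadow entry */
};

struct toy_result {
	int err;    /* 0 on success, -1 if a conflicting folio was found */
	int walks;  /* tree walks this model charges for the insertion */
};

/*
 * Model inserting a folio of 'order' over 'slot'. Each pass of the loop
 * stands for one locked tree walk that performs the ranged conflict check
 * and reads the shadow order in the same walk.
 */
static struct toy_result toy_add_folio(const struct toy_slot *slot, int order)
{
	struct toy_result res = { 0, 0 };
	bool split_mem_ready = false;

	assert(order >= 0);

	for (;;) {
		res.walks++;            /* lock held: range check + order lookup */
		if (slot->occupied) {
			res.err = -1;   /* raced with another insertion: exit */
			return res;
		}
		if (slot->has_shadow && slot->shadow_order > order &&
		    !split_mem_ready) {
			/* lock dropped here: allocate memory for the split */
			split_mem_ready = true;
			continue;       /* retake the lock and check again */
		}
		break;
	}

	if (split_mem_ready)
		res.walks++;            /* insert after the split */
	return res;
}
```

In this model an empty range, or a shadow of the same order, costs one
walk, while an order-9 shadow under an order-0 insert costs three,
matching the counts given above.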

Testing with 4K pages in an 8G cgroup, with a 20G brd as the block device:

fio -name=cached --numjobs=16 --filename=/mnt/test.img \
  --buffered=1 --ioengine=mmap --rw=randread --time_based \
  --ramp_time=30s --runtime=5m --group_reporting

Before:
bw (  MiB/s): min=  790, max= 3665, per=100.00%, avg=2499.17, stdev=20.64, samples=8698
iops        : min=202295, max=938417, avg=639785.81, stdev=5284.08, samples=8698

After (+4%):
bw (  MiB/s): min=  451, max= 3868, per=100.00%, avg=2599.83, stdev=23.39, samples=8653
iops        : min=115596, max=990364, avg=665556.34, stdev=5988.20, samples=8653

Test results with THP (run a THP test first, then switch to 4K pages, in
the hope that this triggers a lot of splitting):

fio -name=cached --numjobs=16 --filename=/mnt/test.img \
  --buffered=1 --ioengine mmap -thp=1 --readonly \
  --rw=randread --random_distribution=random \
  --time_based --runtime=5m --group_reporting

fio -name=cached --numjobs=16 --filename=/mnt/test.img \
  --buffered=1 --ioengine mmap --readonly \
  --rw=randread --random_distribution=random \
  --time_based --runtime=5s --group_reporting

Before:
bw (  KiB/s): min=28071, max=62359, per=100.00%, avg=53542.44, stdev=179.77, samples=9520
iops        : min= 7012, max=15586, avg=13379.39, stdev=44.94, samples=9520
bw (  MiB/s): min= 2457, max= 6193, per=100.00%, avg=3923.21, stdev=82.48, samples=144
iops        : min=629220, max=1585642, avg=1004340.78, stdev=21116.07, samples=144

After (+-0.0%):
bw (  KiB/s): min=30561, max=63064, per=100.00%, avg=53635.82, stdev=177.21, samples=9520
iops        : min= 7636, max=15762, avg=13402.82, stdev=44.29, samples=9520
bw (  MiB/s): min= 2449, max= 6145, per=100.00%, avg=3914.68, stdev=81.15, samples=144
iops        : min=627106, max=1573156, avg=1002158.11, stdev=20774.77, samples=144

The performance is better (+4%) for 4K cached read and unchanged for THP.
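
As a cross-check of the percentages quoted above, the relative change can
be recomputed from the reported averages. The helper name pct_change is
hypothetical and only used for this illustration.

```c
#include <assert.h>

/* Relative change of 'after' versus 'before', in percent. */
static double pct_change(double before, double after)
{
	assert(before > 0.0);
	return (after - before) / before * 100.0;
}
```

With the average bandwidth figures from the runs above,
pct_change(2499.17, 2599.83) is about 4.0 for the 4K test, while the two
THP runs give about 0.17 (53542.44 to 53635.82 KiB/s) and about -0.22
(3923.21 to 3914.68 MiB/s), both well under 1%.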

Kairui Song (4):
  mm/filemap: return early if failed to allocate memory for split
  mm/filemap: clean up hugetlb exclusion code
  lib/xarray: introduce a new helper xas_get_order
  mm/filemap: optimize filemap folio adding

 include/linux/xarray.h |   6 ++
 lib/xarray.c           |  49 +++++++++-----
 mm/filemap.c           | 145 ++++++++++++++++++++++++-----------------
 3 files changed, 121 insertions(+), 79 deletions(-)