From patchwork Sat Feb 17 20:24:16 2024
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13561591
Subject: [PATCH v2 6/6] libfs: Convert simple directory offsets to use a
 Maple Tree
From: Chuck Lever
To: viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz,
 hughd@google.com, akpm@linux-foundation.org, Liam.Howlett@oracle.com,
 oliver.sang@intel.com, feng.tang@intel.com
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 maple-tree@lists.infradead.org, linux-mm@kvack.org, lkp@intel.com
Date: Sat, 17 Feb 2024 15:24:16 -0500
Message-ID: <170820145616.6328.12620992971699079156.stgit@91.116.238.104.host.secureserver.net>
In-Reply-To: <170820083431.6328.16233178852085891453.stgit@91.116.238.104.host.secureserver.net>
References: <170820083431.6328.16233178852085891453.stgit@91.116.238.104.host.secureserver.net>
User-Agent: StGit/1.5

From: Chuck Lever

Test robot reports:

> kernel test robot noticed a -19.0% regression of aim9.disk_src.ops_per_sec on:
>
> commit: a2e459555c5f9da3e619b7e47a63f98574dc75f1 ("shmem: stable directory offsets")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

Feng Tang further clarifies that:

> ... the new simple_offset_add()
> called by shmem_mknod() brings extra cost related with slab,
> specifically the 'radix_tree_node', which cause the regression.

Willy's analysis is that, over time, the test workload causes
xa_alloc_cyclic() to fragment the underlying SLAB cache.

This patch replaces the offset_ctx's xarray with a Maple Tree in the
hope that Maple Tree's dense node mode will handle this scenario
more scalably.

In addition, we can widen the simple directory offset maximum to
signed long (as loff_t is also signed).

Suggested-by: Matthew Wilcox
Reported-by: kernel test robot
Closes: https://lore.kernel.org/oe-lkp/202309081306.3ecb3734-oliver.sang@intel.com
Signed-off-by: Chuck Lever
Reviewed-by: Jan Kara
---
 fs/libfs.c         | 47 +++++++++++++++++++++++------------------------
 include/linux/fs.h |  5 +++--
 2 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/fs/libfs.c b/fs/libfs.c
index f7f92a49a418..d3d31197c8e4 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -245,17 +245,17 @@ enum {
 	DIR_OFFSET_MIN	= 2,
 };
 
-static void offset_set(struct dentry *dentry, u32 offset)
+static void offset_set(struct dentry *dentry, long offset)
 {
-	dentry->d_fsdata = (void *)((uintptr_t)(offset));
+	dentry->d_fsdata = (void *)offset;
 }
 
-static u32 dentry2offset(struct dentry *dentry)
+static long dentry2offset(struct dentry *dentry)
 {
-	return (u32)((uintptr_t)(dentry->d_fsdata));
+	return (long)dentry->d_fsdata;
 }
 
-static struct lock_class_key simple_offset_xa_lock;
+static struct lock_class_key simple_offset_lock_class;
 
 /**
  * simple_offset_init - initialize an offset_ctx
@@ -264,8 +264,8 @@ static struct lock_class_key simple_offset_xa_lock;
  */
 void simple_offset_init(struct offset_ctx *octx)
 {
-	xa_init_flags(&octx->xa, XA_FLAGS_ALLOC1);
-	lockdep_set_class(&octx->xa.xa_lock, &simple_offset_xa_lock);
+	mt_init_flags(&octx->mt, MT_FLAGS_ALLOC_RANGE);
+	lockdep_set_class(&octx->mt.ma_lock, &simple_offset_lock_class);
 
 	octx->next_offset = DIR_OFFSET_MIN;
 }
@@ -274,20 +274,19 @@ void simple_offset_init(struct offset_ctx *octx)
  * @octx: directory offset ctx to be updated
  * @dentry: new dentry being added
  *
- * Returns zero on success. @so_ctx and the dentry offset are updated.
+ * Returns zero on success. @octx and the dentry's offset are updated.
  * Otherwise, a negative errno value is returned.
  */
 int simple_offset_add(struct offset_ctx *octx, struct dentry *dentry)
 {
-	static const struct xa_limit limit = XA_LIMIT(DIR_OFFSET_MIN, U32_MAX);
-	u32 offset;
+	unsigned long offset;
 	int ret;
 
 	if (dentry2offset(dentry) != 0)
 		return -EBUSY;
 
-	ret = xa_alloc_cyclic(&octx->xa, &offset, dentry, limit,
-			      &octx->next_offset, GFP_KERNEL);
+	ret = mtree_alloc_cyclic(&octx->mt, &offset, dentry, DIR_OFFSET_MIN,
+				 LONG_MAX, &octx->next_offset, GFP_KERNEL);
 	if (ret < 0)
 		return ret;
 
@@ -303,13 +302,13 @@ int simple_offset_add(struct offset_ctx *octx, struct dentry *dentry)
  */
 void simple_offset_remove(struct offset_ctx *octx, struct dentry *dentry)
 {
-	u32 offset;
+	long offset;
 
 	offset = dentry2offset(dentry);
 	if (offset == 0)
 		return;
 
-	xa_erase(&octx->xa, offset);
+	mtree_erase(&octx->mt, offset);
 	offset_set(dentry, 0);
 }
 
@@ -332,7 +331,7 @@ int simple_offset_empty(struct dentry *dentry)
 
 	index = DIR_OFFSET_MIN;
 	octx = inode->i_op->get_offset_ctx(inode);
-	xa_for_each(&octx->xa, index, child) {
+	mt_for_each(&octx->mt, child, index, LONG_MAX) {
 		spin_lock(&child->d_lock);
 		if (simple_positive(child)) {
 			spin_unlock(&child->d_lock);
@@ -362,8 +361,8 @@ int simple_offset_rename_exchange(struct inode *old_dir,
 {
 	struct offset_ctx *old_ctx = old_dir->i_op->get_offset_ctx(old_dir);
 	struct offset_ctx *new_ctx = new_dir->i_op->get_offset_ctx(new_dir);
-	u32 old_index = dentry2offset(old_dentry);
-	u32 new_index = dentry2offset(new_dentry);
+	long old_index = dentry2offset(old_dentry);
+	long new_index = dentry2offset(new_dentry);
 	int ret;
 
 	simple_offset_remove(old_ctx, old_dentry);
@@ -389,9 +388,9 @@ int simple_offset_rename_exchange(struct inode *old_dir,
 
 out_restore:
 	offset_set(old_dentry, old_index);
-	xa_store(&old_ctx->xa, old_index, old_dentry, GFP_KERNEL);
+	mtree_store(&old_ctx->mt, old_index, old_dentry, GFP_KERNEL);
 	offset_set(new_dentry, new_index);
-	xa_store(&new_ctx->xa, new_index, new_dentry, GFP_KERNEL);
+	mtree_store(&new_ctx->mt, new_index, new_dentry, GFP_KERNEL);
 	return ret;
 }
 
@@ -404,7 +403,7 @@ int simple_offset_rename_exchange(struct inode *old_dir,
  */
 void simple_offset_destroy(struct offset_ctx *octx)
 {
-	xa_destroy(&octx->xa);
+	mtree_destroy(&octx->mt);
 }
 
 /**
@@ -434,16 +433,16 @@ static loff_t offset_dir_llseek(struct file *file, loff_t offset, int whence)
 
 	/* In this case, ->private_data is protected by f_pos_lock */
 	file->private_data = NULL;
-	return vfs_setpos(file, offset, U32_MAX);
+	return vfs_setpos(file, offset, LONG_MAX);
 }
 
 static struct dentry *offset_find_next(struct offset_ctx *octx, loff_t offset)
 {
+	MA_STATE(mas, &octx->mt, offset, offset);
 	struct dentry *child, *found = NULL;
-	XA_STATE(xas, &octx->xa, offset);
 
 	rcu_read_lock();
-	child = xas_next_entry(&xas, U32_MAX);
+	child = mas_find(&mas, LONG_MAX);
 	if (!child)
 		goto out;
 	spin_lock(&child->d_lock);
@@ -457,8 +456,8 @@ static struct dentry *offset_find_next(struct offset_ctx *octx, loff_t offset)
 
 static bool offset_dir_emit(struct dir_context *ctx, struct dentry *dentry)
 {
-	u32 offset = dentry2offset(dentry);
 	struct inode *inode = d_inode(dentry);
+	long offset = dentry2offset(dentry);
 
 	return ctx->actor(ctx, dentry->d_name.name, dentry->d_name.len, offset,
 			  inode->i_ino, fs_umode_to_dtype(inode->i_mode));
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 03d141809a2c..55144c12ee0f 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -43,6 +43,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -3260,8 +3261,8 @@ extern ssize_t simple_write_to_buffer(void *to, size_t available, loff_t *ppos,
 		const void __user *from, size_t count);
 
 struct offset_ctx {
-	struct xarray		xa;
-	u32			next_offset;
+	struct maple_tree	mt;
+	unsigned long		next_offset;
 };
 
 void simple_offset_init(struct offset_ctx *octx);
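
Illustration only, not part of the patch: the cyclic allocation that
simple_offset_add() depends on can be modelled in userspace C roughly as
follows. Everything here is a hypothetical toy (the toy_* names, the tiny
fixed-size table standing in for the Maple Tree, and the artificial maximum
of 7 in place of LONG_MAX). It assumes only the general idea of a cyclic
allocator: search upward from next_offset, wrap around to the minimum, and
fail when no offset is free. It does not model locking, memory allocation
failure, or any way of telling the caller that the search wrapped.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy limits; the patch uses DIR_OFFSET_MIN and LONG_MAX. */
enum { TOY_OFFSET_MIN = 2, TOY_OFFSET_MAX = 7 };

struct toy_offset_ctx {
	const void *slot[TOY_OFFSET_MAX + 1];	/* NULL means the offset is free */
	unsigned long next_offset;		/* where the next search starts */
};

static void toy_offset_init(struct toy_offset_ctx *octx)
{
	for (size_t i = 0; i <= TOY_OFFSET_MAX; i++)
		octx->slot[i] = NULL;
	octx->next_offset = TOY_OFFSET_MIN;
}

/* Scan [lo, hi] for the first free offset; store it in *out if found. */
static bool toy_find_free(const struct toy_offset_ctx *octx,
			  unsigned long lo, unsigned long hi,
			  unsigned long *out)
{
	for (unsigned long i = lo; i <= hi; i++) {
		if (!octx->slot[i]) {
			*out = i;
			return true;
		}
	}
	return false;
}

/*
 * Cyclic allocation: try next_offset..MAX first, then wrap to MIN..MAX.
 * Returns 0 on success or -EBUSY when every offset is in use.
 */
static int toy_offset_add(struct toy_offset_ctx *octx, const void *entry,
			  unsigned long *offset)
{
	assert(entry != NULL);	/* NULL is reserved to mean "free" */

	if (!toy_find_free(octx, octx->next_offset, TOY_OFFSET_MAX, offset) &&
	    !toy_find_free(octx, TOY_OFFSET_MIN, TOY_OFFSET_MAX, offset))
		return -EBUSY;

	octx->slot[*offset] = entry;
	/* Advance the cursor, wrapping back to the minimum at the top. */
	octx->next_offset = (*offset == TOY_OFFSET_MAX) ?
				TOY_OFFSET_MIN : *offset + 1;
	return 0;
}

static void toy_offset_remove(struct toy_offset_ctx *octx,
			      unsigned long offset)
{
	if (offset >= TOY_OFFSET_MIN && offset <= TOY_OFFSET_MAX)
		octx->slot[offset] = NULL;
}
```

In this model, adding two entries hands out offsets 2 and 3. Removing
offset 2 and adding again yields 4 rather than 2, because the search resumes
at next_offset. The freed offset is reused only after the cursor wraps, which
is how a create/unlink workload keeps touching fresh indices over time.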