From patchwork Mon Jun 26 18:21:27 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13293330
Subject: [PATCH v4 1/3] libfs: Add directory operations for stable offsets
From: Chuck Lever
To: viro@zeniv.linux.org.uk, brauner@kernel.org, hughd@google.com,
 akpm@linux-foundation.org
Cc: Chuck Lever, jlayton@redhat.com, linux-mm@kvack.org,
 linux-fsdevel@vger.kernel.org
Date: Mon, 26 Jun 2023 14:21:27 -0400
Message-ID: <168780368739.2142.1909222585425739373.stgit@manet.1015granger.net>
In-Reply-To: <168780354647.2142.537463116658872680.stgit@manet.1015granger.net>
References: <168780354647.2142.537463116658872680.stgit@manet.1015granger.net>
User-Agent: StGit/1.5

From: Chuck Lever

Create a vector of directory operations in fs/libfs.c that handles
directory seeks and readdir via stable offsets instead of the
current cursor-based mechanism.

For the moment these are unused.

Signed-off-by: Chuck Lever
---
 fs/dcache.c            |    1 
 fs/libfs.c             |  185 ++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/dcache.h |    1 
 include/linux/fs.h     |    9 ++
 4 files changed, 196 insertions(+)

diff --git a/fs/dcache.c b/fs/dcache.c
index 52e6d5fdab6b..9c9a801f3b33 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -1813,6 +1813,7 @@ static struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name)
 	dentry->d_sb = sb;
 	dentry->d_op = NULL;
 	dentry->d_fsdata = NULL;
+	dentry->d_offset = 0;
 	INIT_HLIST_BL_NODE(&dentry->d_hash);
 	INIT_LIST_HEAD(&dentry->d_lru);
 	INIT_LIST_HEAD(&dentry->d_subdirs);
diff --git a/fs/libfs.c b/fs/libfs.c
index 89cf614a3271..07317bbe1668 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -239,6 +239,191 @@ const struct inode_operations simple_dir_inode_operations = {
 };
 EXPORT_SYMBOL(simple_dir_inode_operations);
 
+/**
+ * stable_offset_init - initialize a parent directory
+ * @dir: parent directory to be initialized
+ *
+ */
+void stable_offset_init(struct inode *dir)
+{
+	xa_init_flags(&dir->i_doff_map, XA_FLAGS_ALLOC1);
+	dir->i_next_offset = 0;
+}
+EXPORT_SYMBOL(stable_offset_init);
+
+/**
+ * stable_offset_add - Add an entry to a directory's stable offset map
+ * @dir: parent directory being modified
+ * @dentry: new dentry being added
+ *
+ * Returns zero on success. Otherwise, a negative errno value is returned.
+ */
+int stable_offset_add(struct inode *dir, struct dentry *dentry)
+{
+	struct xa_limit limit = XA_LIMIT(2, U32_MAX);
+	u32 offset = 0;
+	int ret;
+
+	if (dentry->d_offset)
+		return -EBUSY;
+
+	ret = xa_alloc_cyclic(&dir->i_doff_map, &offset, dentry, limit,
+			      &dir->i_next_offset, GFP_KERNEL);
+	if (ret < 0)
+		return ret;
+
+	dentry->d_offset = offset;
+	return 0;
+}
+EXPORT_SYMBOL(stable_offset_add);
+
+/**
+ * stable_offset_remove - Remove an entry from a directory's stable offset map
+ * @dir: parent directory being modified
+ * @dentry: dentry being removed
+ *
+ */
+void stable_offset_remove(struct inode *dir, struct dentry *dentry)
+{
+	if (!dentry->d_offset)
+		return;
+
+	xa_erase(&dir->i_doff_map, dentry->d_offset);
+	dentry->d_offset = 0;
+}
+EXPORT_SYMBOL(stable_offset_remove);
+
+/**
+ * stable_offset_destroy - Release offset map
+ * @dir: parent directory that is about to be destroyed
+ *
+ * During fs teardown (e.g. umount), a directory's offset map might still
+ * contain entries. xa_destroy() cleans out anything that remains.
+ */
+void stable_offset_destroy(struct inode *dir)
+{
+	xa_destroy(&dir->i_doff_map);
+}
+EXPORT_SYMBOL(stable_offset_destroy);
+
+/**
+ * stable_dir_llseek - Advance the read position of a directory descriptor
+ * @file: an open directory whose position is to be updated
+ * @offset: a byte offset
+ * @whence: enumerator describing the starting position for this update
+ *
+ * SEEK_END, SEEK_DATA, and SEEK_HOLE are not supported for directories.
+ *
+ * Returns the updated read position if successful; otherwise a
+ * negative errno is returned and the read position remains unchanged.
+ */
+static loff_t stable_dir_llseek(struct file *file, loff_t offset, int whence)
+{
+	switch (whence) {
+	case SEEK_CUR:
+		offset += file->f_pos;
+		fallthrough;
+	case SEEK_SET:
+		if (offset >= 0)
+			break;
+		fallthrough;
+	default:
+		return -EINVAL;
+	}
+
+	return vfs_setpos(file, offset, U32_MAX);
+}
+
+static struct dentry *stable_find_next(struct xa_state *xas)
+{
+	struct dentry *child, *found = NULL;
+
+	rcu_read_lock();
+	child = xas_next_entry(xas, U32_MAX);
+	if (!child)
+		goto out;
+	spin_lock_nested(&child->d_lock, DENTRY_D_LOCK_NESTED);
+	if (simple_positive(child))
+		found = dget_dlock(child);
+	spin_unlock(&child->d_lock);
+out:
+	rcu_read_unlock();
+	return found;
+}
+
+static bool stable_dir_emit(struct dir_context *ctx, struct dentry *dentry)
+{
+	struct inode *inode = d_inode(dentry);
+
+	return ctx->actor(ctx, dentry->d_name.name, dentry->d_name.len,
+			  dentry->d_offset, inode->i_ino,
+			  fs_umode_to_dtype(inode->i_mode));
+}
+
+static void stable_iterate_dir(struct dentry *dir, struct dir_context *ctx)
+{
+	XA_STATE(xas, &((d_inode(dir))->i_doff_map), ctx->pos);
+	struct dentry *dentry;
+
+	while (true) {
+		spin_lock(&dir->d_lock);
+		dentry = stable_find_next(&xas);
+		spin_unlock(&dir->d_lock);
+		if (!dentry)
+			break;
+
+		if (!stable_dir_emit(ctx, dentry)) {
+			dput(dentry);
+			break;
+		}
+
+		dput(dentry);
+		ctx->pos = xas.xa_index + 1;
+	}
+}
+
+/**
+ * stable_readdir - Emit entries starting at offset @ctx->pos
+ * @file: an open directory to iterate over
+ * @ctx: directory iteration context
+ *
+ * Caller must hold @file's i_rwsem to prevent insertion or removal of
+ * entries during this call.
+ *
+ * On entry, @ctx->pos contains an offset that represents the first entry
+ * to be read from the directory.
+ *
+ * The operation continues until there are no more entries to read, or
+ * until the ctx->actor indicates there is no more space in the caller's
+ * output buffer.
+ *
+ * On return, @ctx->pos contains an offset that will read the next entry
+ * in this directory when stable_readdir() is called again with @ctx.
+ *
+ * Return values:
+ *   %0 - Complete
+ */
+static int stable_readdir(struct file *file, struct dir_context *ctx)
+{
+	struct dentry *dir = file->f_path.dentry;
+
+	lockdep_assert_held(&d_inode(dir)->i_rwsem);
+
+	if (!dir_emit_dots(file, ctx))
+		return 0;
+
+	stable_iterate_dir(dir, ctx);
+	return 0;
+}
+
+const struct file_operations stable_dir_operations = {
+	.llseek		= stable_dir_llseek,
+	.iterate_shared	= stable_readdir,
+	.read		= generic_read_dir,
+	.fsync		= noop_fsync,
+};
+EXPORT_SYMBOL(stable_dir_operations);
+
 static struct dentry *find_next_child(struct dentry *parent, struct dentry *prev)
 {
 	struct dentry *child = NULL;
diff --git a/include/linux/dcache.h b/include/linux/dcache.h
index 6b351e009f59..579ce1800efe 100644
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -96,6 +96,7 @@ struct dentry {
 	struct super_block *d_sb;	/* The root of the dentry tree */
 	unsigned long d_time;		/* used by d_revalidate */
 	void *d_fsdata;			/* fs-specific data */
+	u32 d_offset;			/* directory offset in parent */
 
 	union {
 		struct list_head d_lru;		/* LRU list */
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 133f0640fb24..3fc2c04ed8ff 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -719,6 +719,10 @@ struct inode {
 #endif
 
 	void			*i_private; /* fs or device private pointer */
+
+	/* simplefs stable directory offset tracking */
+	struct xarray		i_doff_map;
+	u32			i_next_offset;
 } __randomize_layout;
 
 struct timespec64 timestamp_truncate(struct timespec64 t, struct inode *inode);
@@ -2924,6 +2928,10 @@ extern int simple_rename(struct mnt_idmap *, struct inode *,
 			 unsigned int);
 extern void simple_recursive_removal(struct dentry *,
                               void (*callback)(struct dentry *));
+extern void stable_offset_init(struct inode *dir);
+extern int stable_offset_add(struct inode *dir, struct dentry *dentry);
+extern void stable_offset_remove(struct inode *dir, struct dentry *dentry);
+extern void stable_offset_destroy(struct inode *dir);
 extern int noop_fsync(struct file *, loff_t, loff_t, int);
 extern ssize_t noop_direct_IO(struct kiocb *iocb, struct iov_iter *iter);
 extern int simple_empty(struct dentry *);
@@ -2939,6 +2947,7 @@ extern const struct dentry_operations simple_dentry_operations;
 extern struct dentry *simple_lookup(struct inode *, struct dentry *, unsigned int flags);
 extern ssize_t generic_read_dir(struct file *, char __user *, size_t, loff_t *);
 extern const struct file_operations simple_dir_operations;
+extern const struct file_operations stable_dir_operations;
 extern const struct inode_operations simple_dir_inode_operations;
 extern void make_empty_dir_inode(struct inode *inode);
 extern bool is_empty_dir_inode(struct inode *inode);
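
Illustration only, not part of the patch: the allocation and iteration
behaviour of the stable offset map can be modelled in userspace C. The
sketch below is a hypothetical stand-in that replaces the xarray with a
small fixed array. Every model_* name and both MODEL_* constants are
invented for this example; none of them is kernel API. It assumes that
offsets 0 and 1 are left for "." and "..", mirroring XA_LIMIT(2, U32_MAX),
and that a freed offset is not handed out again until the search wraps,
which is the behaviour xa_alloc_cyclic() is documented to provide. Locking,
reference counting and RCU are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace model; nothing here exists in the kernel. */
#define MODEL_SLOTS 16u	/* tiny stand-in for the u32 offset space */
#define MODEL_FIRST 2u	/* assumption: 0 and 1 are left for "." and ".." */

struct model_dir {
	const char *slot[MODEL_SLOTS];	/* offset -> name, NULL when free */
	uint32_t next;			/* where the next cyclic search starts */
};

/* Take the first free offset at or after dir->next, wrapping once.
 * Returns the offset, or 0 when the map is full. */
static uint32_t model_add(struct model_dir *dir, const char *name)
{
	uint32_t span = MODEL_SLOTS - MODEL_FIRST;
	uint32_t start = dir->next < MODEL_FIRST ? MODEL_FIRST : dir->next;
	uint32_t i;

	for (i = 0; i < span; i++) {
		uint32_t off = MODEL_FIRST + (start - MODEL_FIRST + i) % span;

		if (!dir->slot[off]) {
			dir->slot[off] = name;
			dir->next = off + 1;
			return off;
		}
	}
	return 0;
}

/* Free an offset; the entries around it keep their own offsets. */
static void model_remove(struct model_dir *dir, uint32_t off)
{
	if (off >= MODEL_FIRST && off < MODEL_SLOTS)
		dir->slot[off] = NULL;
}

/* Return the first live entry at or after *pos and move *pos past it,
 * or return NULL when no entries remain. */
static const char *model_next(const struct model_dir *dir, uint32_t *pos)
{
	uint32_t off = *pos < MODEL_FIRST ? MODEL_FIRST : *pos;

	for (; off < MODEL_SLOTS; off++) {
		if (dir->slot[off]) {
			*pos = off + 1;
			return dir->slot[off];
		}
	}
	return NULL;
}

/* Walk a directory while it changes; returns 0 if every check passes. */
int model_demo(void)
{
	struct model_dir dir = { .next = 0 };
	uint32_t pos = MODEL_FIRST;
	uint32_t b;
	const char *name;

	if (model_add(&dir, "a") != 2)
		return -1;
	b = model_add(&dir, "b");
	if (b != 3 || model_add(&dir, "c") != 4)
		return -1;

	/* Read one entry, then modify the directory behind the reader. */
	name = model_next(&dir, &pos);
	if (!name || strcmp(name, "a") != 0)
		return -1;
	model_remove(&dir, b);
	if (model_add(&dir, "d") != 5)	/* offset 3 is not reused yet */
		return -1;

	/* The saved position still resumes at the right place. */
	name = model_next(&dir, &pos);
	if (!name || strcmp(name, "c") != 0)
		return -1;
	name = model_next(&dir, &pos);
	if (!name || strcmp(name, "d") != 0)
		return -1;
	return model_next(&dir, &pos) == NULL ? 0 : -1;
}
```

Calling model_demo() from a main() and checking for a zero return exercises
the property the patch is after: a reader that saved its position before an
unlink and a create neither repeats nor skips the surviving entries.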