From patchwork Mon Jun 13 23:18:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 12880298 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 00D67C433EF for ; Mon, 13 Jun 2022 23:20:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234353AbiFMXUI (ORCPT ); Mon, 13 Jun 2022 19:20:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51292 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229488AbiFMXUF (ORCPT ); Mon, 13 Jun 2022 19:20:05 -0400 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CDBBBBC32; Mon, 13 Jun 2022 16:20:02 -0700 (PDT) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 8892821AA0; Mon, 13 Jun 2022 23:20:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1655162401; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=IAyx4lnT48FNRLjRvrBUnHtS5K5G19758aNWXIlREG4=; b=IiVqI0b+7Ysyf8nrZ9y1Fz+NGolcwI1S3XvOKvnN4syPmynHdxzlART7ErI5chTFYYSqm+ B59GelGMpXsAEXabaPat3pqVoKtqB4gbSIOIGiyS/+D8BFvd1ftfC9I8erAkYhaY8lq92u iFFZktTzXMpOUcNl5AicjC9eZes5uBg= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1655162401; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=IAyx4lnT48FNRLjRvrBUnHtS5K5G19758aNWXIlREG4=; b=B97reOGAcYHj6qampMy/vPSVYE0MYJSMh/Vs3ZNVKGTjdaCn+On8FphJCeMxTUvU5dqf8s x/FRi8QvJp9Dx5DQ== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 1F470134CF; Mon, 13 Jun 2022 23:19:58 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id MyjIMx7Gp2KbbwAAMHmgww (envelope-from ); Mon, 13 Jun 2022 23:19:58 +0000 Subject: [PATCH 01/12] VFS: support parallel updates in the one directory. From: NeilBrown To: Al Viro , Daire Byrne , Trond Myklebust , Chuck Lever Cc: Linux NFS Mailing List , linux-fsdevel@vger.kernel.org, LKML Date: Tue, 14 Jun 2022 09:18:21 +1000 Message-ID: <165516230195.21248.5682920287666811819.stgit@noble.brown> In-Reply-To: <165516173293.21248.14587048046993234326.stgit@noble.brown> References: <165516173293.21248.14587048046993234326.stgit@noble.brown> User-Agent: StGit/1.5 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org From: NeilBrown Some filesystems can support parallel modifications to a directory, either because the modification happen on a remote server which does its own locking (e.g. NFS) or because they can internally lock just a part of a directory (e.g. many local filesystems, with a bit of work - the lustre project has patches for ext4 to support concurrent updates). To allow this, we introduce VFS support for parallel modification: unlink (including rmdir) and create. Parallel rename is added in a later patch. If a filesystem supports parallel modification in a given directory, it sets S_PAR_UPDATE on the inode for that directory. lookup_open() and the new lookup_hash_update() notice the flag and take a shared lock on the directory, and rely on a lock-bit in d_flags, much like parallel lookup relies on DCACHE_PAR_LOOKUP. __lookup_hash() is enhanced to use d_alloc_parallel() if it was told that a shared lock was taken - by being given a non-NULL *wq. The new DCACHE_PAR_UPDATE is the most significant bit in d_flags, so we have nearly run out of flags. 0x08000000 is still unused .... for now. Once a dentry for the target name has been obtained, DCACHE_PAR_UPDATE is set on it, waiting if necessary if it is already set. Once this is set, the thread has exclusive access to the name and can call into the filesystem to perform the required action. We use wait_var_event() to wait for DCACHE_PAR_UPDATE to be cleared rather than trying to overload the wq used for DCACHE_PAR_LOOKUP. Some filesystems do *not* complete the lookup that precedes a create, but leave the dentry d_in_lookup() and unhashed, so often a dentry will have both DCACHE_PAR_LOOKUP and DCACHE_PAR_UPDATE set at the same time. To allow for this, we need the 'wq' that is passed to d_alloc_parallel() (and used when DCACHE_PAR_LOOKUP is cleared), to continue to exist until the creation is complete, which may be after filename_create() completes. So a wq now needs to be passed to filename_create() when parallel modification is attempted. Waiting for DCACHE_PAR_UPDATE can sometimes result in the dentry changing - either through unlink or rename. This requires that the lookup be retried. This in turn requires that the wq be reused. The use for DECLARE_WAITQUEUE() in d_wait_lookup() and the omission of remove_wait_queue() means that after the wake_up_all() in __d_lookup_down(), the wait_queue_head list will be undefined. So we change d_wait_lookup() to use DEFINE_WAIT(), so that autoremove_wake_function() is used for wakeups, and the wait_queue_head remains well defined. Signed-off-by: NeilBrown --- fs/dcache.c | 59 ++++++++++ fs/namei.c | 277 ++++++++++++++++++++++++++++++++++++++---------- include/linux/dcache.h | 27 +++++ include/linux/fs.h | 1 include/linux/namei.h | 13 ++ 5 files changed, 317 insertions(+), 60 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c index 93f4f5ee07bf..1b523b512575 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -1765,6 +1765,7 @@ static struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name) struct dentry *dentry; char *dname; int err; + static struct lock_class_key __key; dentry = kmem_cache_alloc_lru(dentry_cache, &sb->s_dentry_lru, GFP_KERNEL); @@ -1820,6 +1821,8 @@ static struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name) INIT_LIST_HEAD(&dentry->d_child); d_set_d_op(dentry, dentry->d_sb->s_d_op); + lockdep_init_map(&dentry->d_update_map, "DENTRY_PAR_UPDATE", &__key, 0); + if (dentry->d_op && dentry->d_op->d_init) { err = dentry->d_op->d_init(dentry); if (err) { @@ -2580,7 +2583,7 @@ static inline void end_dir_add(struct inode *dir, unsigned n) static void d_wait_lookup(struct dentry *dentry) { if (d_in_lookup(dentry)) { - DECLARE_WAITQUEUE(wait, current); + DEFINE_WAIT(wait); add_wait_queue(dentry->d_wait, &wait); do { set_current_state(TASK_UNINTERRUPTIBLE); @@ -3209,6 +3212,60 @@ void d_tmpfile(struct dentry *dentry, struct inode *inode) } EXPORT_SYMBOL(d_tmpfile); +/** + * d_lock_update_nested - lock a dentry before updating + * @dentry: the dentry to be locked + * @base: the parent, or %NULL + * @name: the name in that parent, or %NULL + * @subclass: lockdep locking class. + * + * Lock a dentry in a directory on which a shared-lock is held, and + * on which parallel updates are permitted. + * If the base and name are given, then on success the dentry will still + * have that base and name - it will not have raced with rename. + * On success, a positive dentry will still be hashed, ensuring there + * was no race with unlink. + * If they are not given, then on success the dentry will be negative, + * which again ensures no race with rename, or unlink. + */ +bool d_lock_update_nested(struct dentry *dentry, + struct dentry *base, const struct qstr *name, + int subclass) +{ + bool ret = true; + + lock_acquire_exclusive(&dentry->d_update_map, subclass, + 0, NULL, _THIS_IP_); + spin_lock(&dentry->d_lock); + if (dentry->d_flags & DCACHE_PAR_UPDATE) + ___wait_var_event(&dentry->d_flags, + !(dentry->d_flags & DCACHE_PAR_UPDATE), + TASK_UNINTERRUPTIBLE, 0, 0, + (spin_unlock(&dentry->d_lock), + schedule(), + spin_lock(&dentry->d_lock)) + ); + if (d_unhashed(dentry) && d_is_positive(dentry)) { + /* name was unlinked while we waited */ + ret = false; + } else if (base && + (dentry->d_parent != base || + dentry->d_name.hash != name->hash || + !d_same_name(dentry, base, name))) { + /* dentry was renamed - possibly silly-rename */ + ret = false; + } else if (!base && d_is_positive(dentry)) { + ret = false; + } else { + dentry->d_flags |= DCACHE_PAR_UPDATE; + } + spin_unlock(&dentry->d_lock); + if (!ret) + lock_map_release(&dentry->d_update_map); + return ret; +} +EXPORT_SYMBOL(d_lock_update_nested); + static __initdata unsigned long dhash_entries; static int __init set_dhash_entries(char *str) { diff --git a/fs/namei.c b/fs/namei.c index 1f28d3f463c3..9a2ca515c219 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -1575,14 +1575,13 @@ static struct dentry *lookup_dcache(const struct qstr *name, } /* - * Parent directory has inode locked exclusive. This is one - * and only case when ->lookup() gets called on non in-lookup - * dentries - as the matter of fact, this only gets called - * when directory is guaranteed to have no in-lookup children - * at all. + * Parent directory has inode locked exclusive, or shared if wq is + * given. In the later case the fs has explicitly allowed concurrent + * creates in this directory. */ static struct dentry *__lookup_hash(const struct qstr *name, - struct dentry *base, unsigned int flags) + struct dentry *base, unsigned int flags, + wait_queue_head_t *wq) { struct dentry *dentry = lookup_dcache(name, base, flags); struct dentry *old; @@ -1595,18 +1594,107 @@ static struct dentry *__lookup_hash(const struct qstr *name, if (unlikely(IS_DEADDIR(dir))) return ERR_PTR(-ENOENT); - dentry = d_alloc(base, name); + if (wq) + dentry = d_alloc_parallel(base, name, wq); + else + dentry = d_alloc(base, name); if (unlikely(!dentry)) return ERR_PTR(-ENOMEM); + if (IS_ERR(dentry)) + return dentry; + + if (wq && d_in_lookup(dentry)) + /* Must have raced with another thread doing the lookup */ + return dentry; old = dir->i_op->lookup(dir, dentry, flags); if (unlikely(old)) { + d_lookup_done(dentry); dput(dentry); dentry = old; } return dentry; } +/* + * Parent directory (base) is not locked. We take either an exclusive + * or shared lock depending on the fs preference, then do a lookup, + * and then set the DCACHE_PAR_UPDATE bit on the child if a shared lock + * was taken on the parent. + */ +static struct dentry *lookup_hash_update(const struct qstr *name, + struct dentry *base, unsigned int flags, + wait_queue_head_t *wq) +{ + struct dentry *dentry; + struct inode *dir = base->d_inode; + int err; + + if (!(dir->i_flags & S_PAR_UPDATE)) + wq = NULL; + + if (wq) + inode_lock_shared_nested(dir, I_MUTEX_PARENT); + else + inode_lock_nested(dir, I_MUTEX_PARENT); + +retry: + dentry = __lookup_hash(name, base, flags, wq); + if (IS_ERR(dentry)) { + err = PTR_ERR(dentry); + goto out_err; + } + if (wq && !d_lock_update(dentry, base, name)) { + /* + * Failed to get lock due to race with unlink or rename + * - try again + */ + d_lookup_done(dentry); + dput(dentry); + goto retry; + } + return dentry; + +out_err: + if (wq) + inode_unlock_shared(dir); + else + inode_unlock(dir); + return ERR_PTR(err); +} + +static int lookup_one_common(struct user_namespace *mnt_userns, + const char *name, struct dentry *base, int len, + struct qstr *this); + +struct dentry *lookup_hash_update_len(const char *name, int nlen, + struct path *path, unsigned int flags, + wait_queue_head_t *wq) +{ + struct qstr this; + int err = lookup_one_common(mnt_user_ns(path->mnt), name, + path->dentry, nlen, &this); + if (err) + return ERR_PTR(err); + return lookup_hash_update(&this, path->dentry, flags, wq); +} +EXPORT_SYMBOL(lookup_hash_update_len); + +static void done_path_update(struct path *path, struct dentry *dentry, + wait_queue_head_t *wq) +{ + struct inode *dir = path->dentry->d_inode; + bool shared = (dir->i_flags & S_PAR_UPDATE) && wq; + + if (shared) { + d_lookup_done(dentry); + d_unlock_update(dentry); + inode_unlock_shared(dir); + } else { + inode_unlock(dir); + } +} + static struct dentry *lookup_fast(struct nameidata *nd, struct inode **inode, unsigned *seqp) @@ -2589,7 +2677,7 @@ static struct dentry *__kern_path_locked(struct filename *name, struct path *pat return ERR_PTR(-EINVAL); } inode_lock_nested(path->dentry->d_inode, I_MUTEX_PARENT); - d = __lookup_hash(&last, path->dentry, 0); + d = __lookup_hash(&last, path->dentry, 0, NULL); if (IS_ERR(d)) { inode_unlock(path->dentry->d_inode); path_put(path); @@ -3226,16 +3314,32 @@ static struct dentry *atomic_open(struct nameidata *nd, struct dentry *dentry, { struct dentry *const DENTRY_NOT_SET = (void *) -1UL; struct inode *dir = nd->path.dentry->d_inode; + bool have_par_update = false; int error; if (nd->flags & LOOKUP_DIRECTORY) open_flag |= O_DIRECTORY; + /* + * dentry is negative or d_in_lookup(). If this is a + * shared-lock create we need to set DCACHE_PAR_UPDATE to ensure + * exclusion. + */ + if ((open_flag & O_CREAT) && + (dir->i_flags & S_PAR_UPDATE)) { + if (!d_lock_update(dentry, NULL, NULL)) + /* already exists, non-atomic open */ + return dentry; + have_par_update = true; + } + file->f_path.dentry = DENTRY_NOT_SET; file->f_path.mnt = nd->path.mnt; error = dir->i_op->atomic_open(dir, dentry, file, open_to_namei_flags(open_flag), mode); d_lookup_done(dentry); + if (have_par_update) + d_unlock_update(dentry); if (!error) { if (file->f_mode & FMODE_OPENED) { if (unlikely(dentry != file->f_path.dentry)) { @@ -3287,6 +3391,7 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file, int error, create_error = 0; umode_t mode = op->mode; DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq); + bool have_par_update = false; if (unlikely(IS_DEADDIR(dir_inode))) return ERR_PTR(-ENOENT); @@ -3361,6 +3466,14 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file, dentry = res; } } + /* + * If dentry is negative and this is a shared-lock + * create we need to get DCACHE_PAR_UPDATE to ensure exclusion + */ + if ((open_flag & O_CREAT) && + !dentry->d_inode && + (dir_inode->i_flags & S_PAR_UPDATE)) + have_par_update = d_lock_update(dentry, NULL, NULL); /* Negative dentry, just create the file */ if (!dentry->d_inode && (open_flag & O_CREAT)) { @@ -3380,9 +3493,13 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file, error = create_error; goto out_dput; } + if (have_par_update) + d_unlock_update(dentry); return dentry; out_dput: + if (have_par_update) + d_unlock_update(dentry); dput(dentry); return ERR_PTR(error); } @@ -3397,6 +3514,7 @@ static const char *open_last_lookups(struct nameidata *nd, struct inode *inode; struct dentry *dentry; const char *res; + bool shared; nd->flags |= op->intent; @@ -3437,14 +3555,15 @@ static const char *open_last_lookups(struct nameidata *nd, * dropping this one anyway. */ } - if (open_flag & O_CREAT) + shared = !!(dir->d_inode->i_flags & S_PAR_UPDATE); + if ((open_flag & O_CREAT) && !shared) inode_lock(dir->d_inode); else inode_lock_shared(dir->d_inode); dentry = lookup_open(nd, file, op, got_write); if (!IS_ERR(dentry) && (file->f_mode & FMODE_CREATED)) fsnotify_create(dir->d_inode, dentry); - if (open_flag & O_CREAT) + if ((open_flag & O_CREAT) && !shared) inode_unlock(dir->d_inode); else inode_unlock_shared(dir->d_inode); @@ -3712,41 +3831,29 @@ struct file *do_file_open_root(const struct path *root, return file; } -static struct dentry *filename_create(int dfd, struct filename *name, - struct path *path, unsigned int lookup_flags) +static struct dentry *filename_create_one(struct qstr *last, struct path *path, + unsigned int lookup_flags, + wait_queue_head_t *wq) { - struct dentry *dentry = ERR_PTR(-EEXIST); - struct qstr last; + struct dentry *dentry; bool want_dir = lookup_flags & LOOKUP_DIRECTORY; unsigned int reval_flag = lookup_flags & LOOKUP_REVAL; unsigned int create_flags = LOOKUP_CREATE | LOOKUP_EXCL; - int type; int err2; int error; - error = filename_parentat(dfd, name, reval_flag, path, &last, &type); - if (error) - return ERR_PTR(error); - - /* - * Yucky last component or no last component at all? - * (foo/., foo/.., /////) - */ - if (unlikely(type != LAST_NORM)) - goto out; - /* don't fail immediately if it's r/o, at least try to report other errors */ err2 = mnt_want_write(path->mnt); /* * Do the final lookup. Suppress 'create' if there is a trailing * '/', and a directory wasn't requested. */ - if (last.name[last.len] && !want_dir) + if (last->name[last->len] && !want_dir) create_flags = 0; - inode_lock_nested(path->dentry->d_inode, I_MUTEX_PARENT); - dentry = __lookup_hash(&last, path->dentry, reval_flag | create_flags); + dentry = lookup_hash_update(last, path->dentry, + reval_flag | create_flags, wq); if (IS_ERR(dentry)) - goto unlock; + goto drop_write; error = -EEXIST; if (d_is_positive(dentry)) @@ -3768,14 +3875,56 @@ static struct dentry *filename_create(int dfd, struct filename *name, } return dentry; fail: + done_path_update(path, dentry, wq); dput(dentry); dentry = ERR_PTR(error); -unlock: - inode_unlock(path->dentry->d_inode); +drop_write: if (!err2) mnt_drop_write(path->mnt); + return dentry; +} + +struct dentry *filename_create_one_len(const char *name, int nlen, + struct path *path, + unsigned int lookup_flags, + wait_queue_head_t *wq) +{ + struct qstr this; + int err; + + err = lookup_one_common(mnt_user_ns(path->mnt), name, + path->dentry, nlen, &this); + if (err) + return ERR_PTR(err); + return filename_create_one(&this, path, lookup_flags, wq); +} +EXPORT_SYMBOL(filename_create_one_len); + +static struct dentry *filename_create(int dfd, struct filename *name, + struct path *path, unsigned int lookup_flags, + wait_queue_head_t *wq) +{ + struct dentry *dentry = ERR_PTR(-EEXIST); + struct qstr last; + unsigned int reval_flag = lookup_flags & LOOKUP_REVAL; + int type; + int error; + + error = filename_parentat(dfd, name, reval_flag, path, &last, &type); + if (error) + return ERR_PTR(error); + + /* + * Yucky last component or no last component at all? + * (foo/., foo/.., /////) + */ + if (unlikely(type != LAST_NORM)) + goto out; + + dentry = filename_create_one(&last, path, lookup_flags, wq); out: - path_put(path); + if (IS_ERR(dentry)) + path_put(path); return dentry; } @@ -3783,27 +3932,29 @@ struct dentry *kern_path_create(int dfd, const char *pathname, struct path *path, unsigned int lookup_flags) { struct filename *filename = getname_kernel(pathname); - struct dentry *res = filename_create(dfd, filename, path, lookup_flags); + struct dentry *res = filename_create(dfd, filename, path, lookup_flags, + NULL); putname(filename); return res; } EXPORT_SYMBOL(kern_path_create); -void done_path_create(struct path *path, struct dentry *dentry) +void done_path_create_wq(struct path *path, struct dentry *dentry, + wait_queue_head_t *wq) { + done_path_update(path, dentry, wq); dput(dentry); - inode_unlock(path->dentry->d_inode); mnt_drop_write(path->mnt); path_put(path); } -EXPORT_SYMBOL(done_path_create); +EXPORT_SYMBOL(done_path_create_wq); inline struct dentry *user_path_create(int dfd, const char __user *pathname, - struct path *path, unsigned int lookup_flags) + struct path *path, unsigned int lookup_flags) { struct filename *filename = getname(pathname); - struct dentry *res = filename_create(dfd, filename, path, lookup_flags); + struct dentry *res = filename_create(dfd, filename, path, lookup_flags, NULL); putname(filename); return res; @@ -3882,12 +4033,13 @@ static int do_mknodat(int dfd, struct filename *name, umode_t mode, struct path path; int error; unsigned int lookup_flags = 0; + DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq); error = may_mknod(mode); if (error) goto out1; retry: - dentry = filename_create(dfd, name, &path, lookup_flags); + dentry = filename_create(dfd, name, &path, lookup_flags, &wq); error = PTR_ERR(dentry); if (IS_ERR(dentry)) goto out1; @@ -3916,7 +4068,7 @@ static int do_mknodat(int dfd, struct filename *name, umode_t mode, break; } out2: - done_path_create(&path, dentry); + done_path_create_wq(&path, dentry, &wq); if (retry_estale(error, lookup_flags)) { lookup_flags |= LOOKUP_REVAL; goto retry; @@ -3985,9 +4137,10 @@ int do_mkdirat(int dfd, struct filename *name, umode_t mode) struct path path; int error; unsigned int lookup_flags = LOOKUP_DIRECTORY; + DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq); retry: - dentry = filename_create(dfd, name, &path, lookup_flags); + dentry = filename_create(dfd, name, &path, lookup_flags, &wq); error = PTR_ERR(dentry); if (IS_ERR(dentry)) goto out_putname; @@ -4001,7 +4154,7 @@ int do_mkdirat(int dfd, struct filename *name, umode_t mode) error = vfs_mkdir(mnt_userns, path.dentry->d_inode, dentry, mode); } - done_path_create(&path, dentry); + done_path_create_wq(&path, dentry, &wq); if (retry_estale(error, lookup_flags)) { lookup_flags |= LOOKUP_REVAL; goto retry; @@ -4085,6 +4238,7 @@ int do_rmdir(int dfd, struct filename *name) struct qstr last; int type; unsigned int lookup_flags = 0; + DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq); retry: error = filename_parentat(dfd, name, lookup_flags, &path, &last, &type); if (error) @@ -4106,8 +4260,7 @@ int do_rmdir(int dfd, struct filename *name) if (error) goto exit2; - inode_lock_nested(path.dentry->d_inode, I_MUTEX_PARENT); - dentry = __lookup_hash(&last, path.dentry, lookup_flags); + dentry = lookup_hash_update(&last, path.dentry, lookup_flags, &wq); error = PTR_ERR(dentry); if (IS_ERR(dentry)) goto exit3; @@ -4121,9 +4274,9 @@ int do_rmdir(int dfd, struct filename *name) mnt_userns = mnt_user_ns(path.mnt); error = vfs_rmdir(mnt_userns, path.dentry->d_inode, dentry); exit4: + done_path_update(&path, dentry, &wq); dput(dentry); exit3: - inode_unlock(path.dentry->d_inode); mnt_drop_write(path.mnt); exit2: path_put(&path); @@ -4148,13 +4301,14 @@ SYSCALL_DEFINE1(rmdir, const char __user *, pathname) * @dentry: victim * @delegated_inode: returns victim inode, if the inode is delegated. * - * The caller must hold dir->i_mutex. + * The caller must either hold a write-lock on dir->i_rwsem, or + * a read-lock having atomically set DCACHE_PAR_UPDATE. * * If vfs_unlink discovers a delegation, it will return -EWOULDBLOCK and * return a reference to the inode in delegated_inode. The caller * should then break the delegation on that inode and retry. Because * breaking a delegation may take a long time, the caller should drop - * dir->i_mutex before doing so. + * dir->i_rwsem before doing so. * * Alternatively, a caller may pass NULL for delegated_inode. This may * be appropriate for callers that expect the underlying filesystem not @@ -4213,9 +4367,11 @@ EXPORT_SYMBOL(vfs_unlink); /* * Make sure that the actual truncation of the file will occur outside its - * directory's i_mutex. Truncate can take a long time if there is a lot of + * directory's i_rwsem. Truncate can take a long time if there is a lot of * writeout happening, and we don't want to prevent access to the directory * while waiting on the I/O. + * If the directory permits (S_PAR_UPDATE), we take a shared lock on the + * directory DCACHE_PAR_UPDATE to get exclusive access to the dentry. */ int do_unlinkat(int dfd, struct filename *name) { @@ -4227,6 +4383,7 @@ int do_unlinkat(int dfd, struct filename *name) struct inode *inode = NULL; struct inode *delegated_inode = NULL; unsigned int lookup_flags = 0; + DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq); retry: error = filename_parentat(dfd, name, lookup_flags, &path, &last, &type); if (error) @@ -4239,9 +4396,9 @@ int do_unlinkat(int dfd, struct filename *name) error = mnt_want_write(path.mnt); if (error) goto exit2; + retry_deleg: - inode_lock_nested(path.dentry->d_inode, I_MUTEX_PARENT); - dentry = __lookup_hash(&last, path.dentry, lookup_flags); + dentry = lookup_hash_update(&last, path.dentry, lookup_flags, &wq); error = PTR_ERR(dentry); if (!IS_ERR(dentry)) { struct user_namespace *mnt_userns; @@ -4260,9 +4417,9 @@ int do_unlinkat(int dfd, struct filename *name) error = vfs_unlink(mnt_userns, path.dentry->d_inode, dentry, &delegated_inode); exit3: + done_path_update(&path, dentry, &wq); dput(dentry); } - inode_unlock(path.dentry->d_inode); if (inode) iput(inode); /* truncate the inode here */ inode = NULL; @@ -4351,13 +4508,14 @@ int do_symlinkat(struct filename *from, int newdfd, struct filename *to) struct dentry *dentry; struct path path; unsigned int lookup_flags = 0; + DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq); if (IS_ERR(from)) { error = PTR_ERR(from); goto out_putnames; } retry: - dentry = filename_create(newdfd, to, &path, lookup_flags); + dentry = filename_create(newdfd, to, &path, lookup_flags, &wq); error = PTR_ERR(dentry); if (IS_ERR(dentry)) goto out_putnames; @@ -4370,7 +4528,7 @@ int do_symlinkat(struct filename *from, int newdfd, struct filename *to) error = vfs_symlink(mnt_userns, path.dentry->d_inode, dentry, from->name); } - done_path_create(&path, dentry); + done_path_create_wq(&path, dentry, &wq); if (retry_estale(error, lookup_flags)) { lookup_flags |= LOOKUP_REVAL; goto retry; @@ -4499,6 +4657,7 @@ int do_linkat(int olddfd, struct filename *old, int newdfd, struct inode *delegated_inode = NULL; int how = 0; int error; + DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq); if ((flags & ~(AT_SYMLINK_FOLLOW | AT_EMPTY_PATH)) != 0) { error = -EINVAL; @@ -4522,7 +4681,7 @@ int do_linkat(int olddfd, struct filename *old, int newdfd, goto out_putnames; new_dentry = filename_create(newdfd, new, &new_path, - (how & LOOKUP_REVAL)); + (how & LOOKUP_REVAL), &wq); error = PTR_ERR(new_dentry); if (IS_ERR(new_dentry)) goto out_putpath; @@ -4540,7 +4699,7 @@ int do_linkat(int olddfd, struct filename *old, int newdfd, error = vfs_link(old_path.dentry, mnt_userns, new_path.dentry->d_inode, new_dentry, &delegated_inode); out_dput: - done_path_create(&new_path, new_dentry); + done_path_create_wq(&new_path, new_dentry, &wq); if (delegated_inode) { error = break_deleg_wait(&delegated_inode); if (!error) { @@ -4810,7 +4969,8 @@ int do_renameat2(int olddfd, struct filename *from, int newdfd, retry_deleg: trap = lock_rename(new_path.dentry, old_path.dentry); - old_dentry = __lookup_hash(&old_last, old_path.dentry, lookup_flags); + old_dentry = __lookup_hash(&old_last, old_path.dentry, + lookup_flags, NULL); error = PTR_ERR(old_dentry); if (IS_ERR(old_dentry)) goto exit3; @@ -4818,7 +4978,8 @@ int do_renameat2(int olddfd, struct filename *from, int newdfd, error = -ENOENT; if (d_is_negative(old_dentry)) goto exit4; - new_dentry = __lookup_hash(&new_last, new_path.dentry, lookup_flags | target_flags); + new_dentry = __lookup_hash(&new_last, new_path.dentry, + lookup_flags | target_flags, NULL); error = PTR_ERR(new_dentry); if (IS_ERR(new_dentry)) goto exit4; diff --git a/include/linux/dcache.h b/include/linux/dcache.h index f5bba51480b2..6331cf4ac87a 100644 --- a/include/linux/dcache.h +++ b/include/linux/dcache.h @@ -13,7 +13,9 @@ #include #include #include +#include #include +#include struct path; struct vfsmount; @@ -96,6 +98,8 @@ struct dentry { unsigned long d_time; /* used by d_revalidate */ void *d_fsdata; /* fs-specific data */ + /* lockdep tracking of DCACHE_PAR_UPDATE locks */ + struct lockdep_map d_update_map; union { struct list_head d_lru; /* LRU list */ wait_queue_head_t *d_wait; /* in-lookup ones only */ @@ -211,6 +215,7 @@ struct dentry_operations { #define DCACHE_PAR_LOOKUP 0x10000000 /* being looked up (with parent locked shared) */ #define DCACHE_DENTRY_CURSOR 0x20000000 #define DCACHE_NORCU 0x40000000 /* No RCU delay for freeing */ +#define DCACHE_PAR_UPDATE 0x80000000 /* Being created or unlinked with shared lock */ extern seqlock_t rename_lock; @@ -599,4 +604,26 @@ struct name_snapshot { void take_dentry_name_snapshot(struct name_snapshot *, struct dentry *); void release_dentry_name_snapshot(struct name_snapshot *); +bool d_lock_update_nested(struct dentry *dentry, + struct dentry *base, const struct qstr *name, + int subclass); +static inline bool d_lock_update(struct dentry *dentry, + struct dentry *base, const struct qstr *name) +{ + return d_lock_update_nested(dentry, base, name, 0); +} + +static inline void d_unlock_update(struct dentry *dentry) +{ + if (IS_ERR_OR_NULL(dentry)) + return; + if (dentry->d_flags & DCACHE_PAR_UPDATE) { + lock_map_release(&dentry->d_update_map); + spin_lock(&dentry->d_lock); + dentry->d_flags &= ~DCACHE_PAR_UPDATE; + spin_unlock(&dentry->d_lock); + wake_up_var(&dentry->d_flags); + } +} + #endif /* __LINUX_DCACHE_H */ diff --git a/include/linux/fs.h b/include/linux/fs.h index 9ad5e3520fae..96a2a23927e1 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2146,6 +2146,7 @@ struct super_operations { #define S_CASEFOLD (1 << 15) /* Casefolded file */ #define S_VERITY (1 << 16) /* Verity file (using fs/verity/) */ #define S_KERNEL_FILE (1 << 17) /* File is in use by the kernel (eg. fs/cachefiles) */ +#define S_PAR_UPDATE (1 << 18) /* Parallel directory operations supported */ /* * Note that nosuid etc flags are inode-specific: setting some file-system diff --git a/include/linux/namei.h b/include/linux/namei.h index caeb08a98536..f75c6639dd1a 100644 --- a/include/linux/namei.h +++ b/include/linux/namei.h @@ -61,7 +61,14 @@ extern int kern_path(const char *, unsigned, struct path *); extern struct dentry *kern_path_create(int, const char *, struct path *, unsigned int); extern struct dentry *user_path_create(int, const char __user *, struct path *, unsigned int); -extern void done_path_create(struct path *, struct dentry *); +extern struct dentry *lookup_hash_update_len(const char *name, int nlen, + struct path *path, unsigned int flags, + wait_queue_head_t *wq); +extern void done_path_create_wq(struct path *, struct dentry *, wait_queue_head_t *wq); +static inline void done_path_create(struct path *path, struct dentry *dentry) +{ + done_path_create_wq(path, dentry, NULL); +} extern struct dentry *kern_path_locked(const char *, struct path *); extern struct dentry *try_lookup_one_len(const char *, struct dentry *, int); @@ -75,6 +82,10 @@ struct dentry *lookup_one_unlocked(struct user_namespace *mnt_userns, struct dentry *lookup_one_positive_unlocked(struct user_namespace *mnt_userns, const char *name, struct dentry *base, int len); +struct dentry *filename_create_one_len(const char *name, int nlen, + struct path *path, + unsigned int lookup_flags, + wait_queue_head_t *wq); extern int follow_down_one(struct path *); extern int follow_down(struct path *); From patchwork Mon Jun 13 23:18:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 12880299 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 61EC7CCA47B for ; Mon, 13 Jun 2022 23:20:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235333AbiFMXUU (ORCPT ); Mon, 13 Jun 2022 19:20:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51384 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235127AbiFMXUO (ORCPT ); Mon, 13 Jun 2022 19:20:14 -0400 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F246EDF05; Mon, 13 Jun 2022 16:20:12 -0700 (PDT) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 7BDA11F959; Mon, 13 Jun 2022 23:20:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1655162411; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=82DjfRBW99HvcPcA756zHmAdDUvmWlAbNLkjBBig1RU=; b=onyQo7w/F0j7JWAstX/Wf8oR9psNhx1SG23FAe2QRj3Dl8qx03/yLvA+O/XM7M23wHrmOc pH9zLJIvsBz8c0/ne+N1laY26Y9dnQeJYqQVXG5DPaVfQvXpO+ZTMSnpO+QUUHJ+xywWBs NBk6PDAZBPlHpGuLznAgyxxzCp4qkTw= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1655162411; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=82DjfRBW99HvcPcA756zHmAdDUvmWlAbNLkjBBig1RU=; b=bKil98UTCl04ew4EmOoqlArjuqiVuARFlR9KJdTnnLaClKo9d1GorIumVTGQ7m9E2oobVn T20lIhtIhAndPqDw== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 0DDE2134CF; Mon, 13 Jun 2022 23:20:07 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id kkCnLifGp2K1bwAAMHmgww (envelope-from ); Mon, 13 Jun 2022 23:20:07 +0000 Subject: [PATCH 02/12] VFS: move EEXIST and ENOENT tests into lookup_hash_update() From: NeilBrown To: Al Viro , Daire Byrne , Trond Myklebust , Chuck Lever Cc: Linux NFS Mailing List , linux-fsdevel@vger.kernel.org, LKML Date: Tue, 14 Jun 2022 09:18:21 +1000 Message-ID: <165516230197.21248.5856192432755210577.stgit@noble.brown> In-Reply-To: <165516173293.21248.14587048046993234326.stgit@noble.brown> References: <165516173293.21248.14587048046993234326.stgit@noble.brown> User-Agent: StGit/1.5 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Moving common error handling into lookup_hash_update() simplifies callers. A future patch will export this functionality to nfsd, and the more code we put in the interface, the less code will be needed in nfsd. Signed-off-by: NeilBrown --- fs/namei.c | 55 +++++++++++++++++++++++++++++++------------------------ 1 file changed, 31 insertions(+), 24 deletions(-) diff --git a/fs/namei.c b/fs/namei.c index 9a2ca515c219..e7a56d6e2472 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -1563,7 +1563,13 @@ static struct dentry *lookup_dcache(const struct qstr *name, { struct dentry *dentry = d_lookup(dir, name); if (dentry) { - int error = d_revalidate(dentry, flags); + int error; + /* Some filesystems assume EXCL -> CREATE, so make + * sure it does. + */ + if (!(flags & LOOKUP_CREATE)) + flags &= ~LOOKUP_EXCL; + error = d_revalidate(dentry, flags); if (unlikely(error <= 0)) { if (!error) d_invalidate(dentry); @@ -1621,6 +1627,8 @@ static struct dentry *__lookup_hash(const struct qstr *name, * or shared lock depending on the fs preference, then do a lookup, * and then set the DCACHE_PAR_UPDATE bit on the child if a shared lock * was taken on the parent. + * If LOOKUP_EXCL, name should not already exist, else -EEXIST + * If not LOOKUP_CREATE, name should already exist, else -ENOENT */ static struct dentry *lookup_hash_update(const struct qstr *name, struct dentry *base, unsigned int flags, @@ -1644,6 +1652,20 @@ static struct dentry *lookup_hash_update(const struct qstr *name, err = PTR_ERR(dentry); goto out_err; } + if (flags & LOOKUP_EXCL) { + if (d_is_positive(dentry)) { + dput(dentry); + err = -EEXIST; + goto out_err; + } + } + if (!(flags & LOOKUP_CREATE)) { + if (!dentry->d_inode) { + dput(dentry); + err = -ENOENT; + goto out_err; + } + } if (wq && !d_lock_update(dentry, base, name)) { /* * Failed to get lock due to race with unlink or rename @@ -3838,7 +3860,7 @@ static struct dentry *filename_create_one(struct qstr *last, struct path *path, struct dentry *dentry; bool want_dir = lookup_flags & LOOKUP_DIRECTORY; unsigned int reval_flag = lookup_flags & LOOKUP_REVAL; - unsigned int create_flags = LOOKUP_CREATE | LOOKUP_EXCL; + unsigned int create_flag = LOOKUP_CREATE; int err2; int error; @@ -3849,26 +3871,17 @@ static struct dentry *filename_create_one(struct qstr *last, struct path *path, * '/', and a directory wasn't requested. */ if (last->name[last->len] && !want_dir) - create_flags = 0; + /* Name was foo/bar/ but not creating a directory, so + * we won't try to create - result with be either -ENOENT + * or -EEXIST. + */ + create_flag = 0; dentry = lookup_hash_update(last, path->dentry, - reval_flag | create_flags, wq); + reval_flag | create_flag | LOOKUP_EXCL, + wq); if (IS_ERR(dentry)) goto drop_write; - error = -EEXIST; - if (d_is_positive(dentry)) - goto fail; - - /* - * Special case - lookup gave negative, but... we had foo/bar/ - * From the vfs_mknod() POV we just have a negative dentry - - * all is fine. Let's be bastards - you had / on the end, you've - * been asking for (non-existent) directory. -ENOENT for you. - */ - if (unlikely(!create_flags)) { - error = -ENOENT; - goto fail; - } if (unlikely(err2)) { error = err2; goto fail; @@ -4264,10 +4277,6 @@ int do_rmdir(int dfd, struct filename *name) error = PTR_ERR(dentry); if (IS_ERR(dentry)) goto exit3; - if (!dentry->d_inode) { - error = -ENOENT; - goto exit4; - } error = security_path_rmdir(&path, dentry); if (error) goto exit4; @@ -4407,8 +4416,6 @@ int do_unlinkat(int dfd, struct filename *name) if (last.name[last.len]) goto slashes; inode = dentry->d_inode; - if (d_is_negative(dentry)) - goto slashes; ihold(inode); error = security_path_unlink(&path, dentry); if (error) From patchwork Mon Jun 13 23:18:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 12880300 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1E442C433EF for ; Mon, 13 Jun 2022 23:20:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236807AbiFMXUb (ORCPT ); Mon, 13 Jun 2022 19:20:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51522 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236174AbiFMXU0 (ORCPT ); Mon, 13 Jun 2022 19:20:26 -0400 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 83B1E17A97; Mon, 13 Jun 2022 16:20:24 -0700 (PDT) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 3E0691F959; Mon, 13 Jun 2022 23:20:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1655162423; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=49sNL+JXZC0xvhRBhT/jrprRMvF1g4wqNINkvpdWU1E=; b=BolQoJ8ewEmgzdSsf/nS6bIY0HzmeYMtTc6jYUn8MwSSvXnNCikGuGkicNulbJu03KlbcR rWsm5NRyuoz86HDX3aUI+HZfT3JilqqIFfwIcr0egwaOX1H8Za9MzoEUZyOaCebILatGbG HklNAU2MoqjOOQaz7u5t7JdPcFKH8xY= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1655162423; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=49sNL+JXZC0xvhRBhT/jrprRMvF1g4wqNINkvpdWU1E=; b=m29J4WRcV1TkrUSuKc7s9YsqQYWcxjn8kaNGPA9SGHmYTOkHVMwFJyPa08FE243okQ2MYf TH8vtYk/7pYV4EDw== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 07989134CF; Mon, 13 Jun 2022 23:20:20 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id 2c4QLTTGp2K5bwAAMHmgww (envelope-from ); Mon, 13 Jun 2022 23:20:20 +0000 Subject: [PATCH 03/12] VFS: move want_write checks into lookup_hash_update() From: NeilBrown To: Al Viro , Daire Byrne , Trond Myklebust , Chuck Lever Cc: Linux NFS Mailing List , linux-fsdevel@vger.kernel.org, LKML Date: Tue, 14 Jun 2022 09:18:21 +1000 Message-ID: <165516230197.21248.171443631120298429.stgit@noble.brown> In-Reply-To: <165516173293.21248.14587048046993234326.stgit@noble.brown> References: <165516173293.21248.14587048046993234326.stgit@noble.brown> User-Agent: StGit/1.5 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org mnt_want_write() is always called before lookup_hash_update(), so we can simplify the code by moving the call into that function. If lookup_hash_update() succeeds, it now will have claimed the want_write lock. If it fails, the want_write lock isn't held. Also lookup_hash_update() now receives a 'struct path'. Note that when creating a name, any error from mnt_want_write() does not get reported unless there is no other error. For unlink/rmdir though, an error from mnt_want_write() is immediately fatal - overriding ENOENT for example. This behaviour seems strange, but this patch is careful to preserve it. Signed-off-by: NeilBrown --- fs/namei.c | 71 +++++++++++++++++++++++++----------------------------------- 1 file changed, 29 insertions(+), 42 deletions(-) diff --git a/fs/namei.c b/fs/namei.c index e7a56d6e2472..83ce2f7083be 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -1631,12 +1631,22 @@ static struct dentry *__lookup_hash(const struct qstr *name, * If not LOOKUP_CREATE, name should already exist, else -ENOENT */ static struct dentry *lookup_hash_update(const struct qstr *name, - struct dentry *base, unsigned int flags, + struct path *path, unsigned int flags, wait_queue_head_t *wq) { struct dentry *dentry; + struct dentry *base = path->dentry; struct inode *dir = base->d_inode; - int err; + int err, err2; + + /* For create, don't fail immediately if it's r/o, + * at least try to report other errors. + * For unlink/rmdir where LOOKUP_REVAl is the only + * flag, fail immediately if r/o. + */ + err2 = mnt_want_write(path->mnt); + if (err2 && (flags & ~LOOKUP_REVAL) == 0) + return ERR_PTR(err2); if (!(dir->i_flags & S_PAR_UPDATE)) wq = NULL; @@ -1675,6 +1685,11 @@ static struct dentry *lookup_hash_update(const struct qstr *name, dput(dentry); goto retry; } + if (err2) { + err = err2; + dput(dentry); + goto out_err; + } return dentry; out_err: @@ -1682,6 +1697,8 @@ static struct dentry *lookup_hash_update(const struct qstr *name, inode_unlock_shared(dir); else inode_unlock(dir); + if (!err2) + mnt_drop_write(path->mnt); return ERR_PTR(err); } @@ -1698,7 +1715,7 @@ struct dentry *lookup_hash_update_len(const char *name, int nlen, path->dentry, nlen, &this); if (err) return ERR_PTR(err); - return lookup_hash_update(&this, path->dentry, flags, wq); + return lookup_hash_update(&this, path, flags, wq); } EXPORT_SYMBOL(lookup_hash_update_len); @@ -3857,15 +3874,10 @@ static struct dentry *filename_create_one(struct qstr *last, struct path *path, unsigned int lookup_flags, wait_queue_head_t *wq) { - struct dentry *dentry; bool want_dir = lookup_flags & LOOKUP_DIRECTORY; unsigned int reval_flag = lookup_flags & LOOKUP_REVAL; unsigned int create_flag = LOOKUP_CREATE; - int err2; - int error; - /* don't fail immediately if it's r/o, at least try to report other errors */ - err2 = mnt_want_write(path->mnt); /* * Do the final lookup. Suppress 'create' if there is a trailing * '/', and a directory wasn't requested. @@ -3876,25 +3888,9 @@ static struct dentry *filename_create_one(struct qstr *last, struct path *path, * or -EEXIST. */ create_flag = 0; - dentry = lookup_hash_update(last, path->dentry, - reval_flag | create_flag | LOOKUP_EXCL, - wq); - if (IS_ERR(dentry)) - goto drop_write; - - if (unlikely(err2)) { - error = err2; - goto fail; - } - return dentry; -fail: - done_path_update(path, dentry, wq); - dput(dentry); - dentry = ERR_PTR(error); -drop_write: - if (!err2) - mnt_drop_write(path->mnt); - return dentry; + return lookup_hash_update(last, path, + reval_flag | create_flag | LOOKUP_EXCL, + wq); } struct dentry *filename_create_one_len(const char *name, int nlen, @@ -4269,23 +4265,18 @@ int do_rmdir(int dfd, struct filename *name) goto exit2; } - error = mnt_want_write(path.mnt); - if (error) - goto exit2; - - dentry = lookup_hash_update(&last, path.dentry, lookup_flags, &wq); + dentry = lookup_hash_update(&last, &path, lookup_flags, &wq); error = PTR_ERR(dentry); if (IS_ERR(dentry)) - goto exit3; + goto exit2; error = security_path_rmdir(&path, dentry); if (error) - goto exit4; + goto exit3; mnt_userns = mnt_user_ns(path.mnt); error = vfs_rmdir(mnt_userns, path.dentry->d_inode, dentry); -exit4: +exit3: done_path_update(&path, dentry, &wq); dput(dentry); -exit3: mnt_drop_write(path.mnt); exit2: path_put(&path); @@ -4402,12 +4393,8 @@ int do_unlinkat(int dfd, struct filename *name) if (type != LAST_NORM) goto exit2; - error = mnt_want_write(path.mnt); - if (error) - goto exit2; - retry_deleg: - dentry = lookup_hash_update(&last, path.dentry, lookup_flags, &wq); + dentry = lookup_hash_update(&last, &path, lookup_flags, &wq); error = PTR_ERR(dentry); if (!IS_ERR(dentry)) { struct user_namespace *mnt_userns; @@ -4426,6 +4413,7 @@ int do_unlinkat(int dfd, struct filename *name) exit3: done_path_update(&path, dentry, &wq); dput(dentry); + mnt_drop_write(path.mnt); } if (inode) iput(inode); /* truncate the inode here */ @@ -4435,7 +4423,6 @@ int do_unlinkat(int dfd, struct filename *name) if (!error) goto retry_deleg; } - mnt_drop_write(path.mnt); exit2: path_put(&path); if (retry_estale(error, lookup_flags)) { From patchwork Mon Jun 13 23:18:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 12880301 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4A3F3C433EF for ; Mon, 13 Jun 2022 23:20:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238745AbiFMXUp (ORCPT ); Mon, 13 Jun 2022 19:20:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51904 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236428AbiFMXUn (ORCPT ); Mon, 13 Jun 2022 19:20:43 -0400 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8AB5B2983D; Mon, 13 Jun 2022 16:20:39 -0700 (PDT) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 34EB11F959; Mon, 13 Jun 2022 23:20:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1655162438; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=dv0t/moe0oo2g4yNdVPUDgVdwdMBKpyu/x9jlXnCUWQ=; b=gat8BNdlwOtlhRNhKV7bDO0aFl07I0a2ImRNHxKdZm/OlKMUeP89eD7Vy0zTlZzn+fswIW BJBM3zg3jTWoZsgs0i5+oizCj6FvQ82vjWdHWLy/CPZ1lH+r529KPtVc9r/6gNiX2UlryB en50mmh+nZQeoojQnNnd+KiTbC01BgA= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1655162438; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=dv0t/moe0oo2g4yNdVPUDgVdwdMBKpyu/x9jlXnCUWQ=; b=C0BcQJ7jOn+AVLULjJbrc70KgLqI8xMLcwgkLYm31PuvYVhtEl0PGgF4leaYNeOmCmrwuM E+UlmOtqwBVkhhAA== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id E824A134CF; Mon, 13 Jun 2022 23:20:35 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id gk8JKEPGp2LNbwAAMHmgww (envelope-from ); Mon, 13 Jun 2022 23:20:35 +0000 Subject: [PATCH 04/12] VFS: move dput() and mnt_drop_write() into done_path_update() From: NeilBrown To: Al Viro , Daire Byrne , Trond Myklebust , Chuck Lever Cc: Linux NFS Mailing List , linux-fsdevel@vger.kernel.org, LKML Date: Tue, 14 Jun 2022 09:18:21 +1000 Message-ID: <165516230198.21248.8276913837756883567.stgit@noble.brown> In-Reply-To: <165516173293.21248.14587048046993234326.stgit@noble.brown> References: <165516173293.21248.14587048046993234326.stgit@noble.brown> User-Agent: StGit/1.5 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org All calls to done_path_update() are followed by the same two calls, so merge them in. Signed-off-by: NeilBrown --- fs/namei.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/fs/namei.c b/fs/namei.c index 83ce2f7083be..f13bff877e30 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -1732,6 +1732,8 @@ static void done_path_update(struct path *path, struct dentry *dentry, } else { inode_unlock(dir); } + dput(dentry); + mnt_drop_write(path->mnt); } static struct dentry *lookup_fast(struct nameidata *nd, @@ -3953,8 +3955,6 @@ void done_path_create_wq(struct path *path, struct dentry *dentry, wait_queue_head_t *wq) { done_path_update(path, dentry, wq); - dput(dentry); - mnt_drop_write(path->mnt); path_put(path); } EXPORT_SYMBOL(done_path_create_wq); @@ -4276,8 +4276,6 @@ int do_rmdir(int dfd, struct filename *name) error = vfs_rmdir(mnt_userns, path.dentry->d_inode, dentry); exit3: done_path_update(&path, dentry, &wq); - dput(dentry); - mnt_drop_write(path.mnt); exit2: path_put(&path); if (retry_estale(error, lookup_flags)) { @@ -4412,8 +4410,6 @@ int do_unlinkat(int dfd, struct filename *name) &delegated_inode); exit3: done_path_update(&path, dentry, &wq); - dput(dentry); - mnt_drop_write(path.mnt); } if (inode) iput(inode); /* truncate the inode here */ From patchwork Mon Jun 13 23:18:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 12880302 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3CAF9CCA47B for ; Mon, 13 Jun 2022 23:20:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239465AbiFMXUs (ORCPT ); Mon, 13 Jun 2022 19:20:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51886 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237654AbiFMXUq (ORCPT ); Mon, 13 Jun 2022 19:20:46 -0400 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5C1A52B18B; Mon, 13 Jun 2022 16:20:45 -0700 (PDT) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 1359221AA0; Mon, 13 Jun 2022 23:20:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1655162444; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0MtfIrHZ/Ov+xd8eT4mDLOzkjS/G6SbIerKz5jiAI3I=; b=emhdbCNAOOE1/IwfamybQdjwRm6qZn705F8LzcQ4TGJWtMsWORR1RV8wvdI8R8UBwg+nqt eK5+M5UOZer6S5B8g2ga1SnAxggMidfD8MMxQpXfG++i5tF2QWh3rFD6OCqoSLfwIanldA jnosA1MI1CsarnyZzyJkV+EE52Oo9Vs= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1655162444; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0MtfIrHZ/Ov+xd8eT4mDLOzkjS/G6SbIerKz5jiAI3I=; b=7PhoTejWCCKC2GX1e1gWHKvG7tRdUbGbaSU82YIgswwoD+B5NXPg1o1E6WArermrtNdijf nJlbv0/YcU2Yl0CQ== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id D4033134CF; Mon, 13 Jun 2022 23:20:41 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id NEFuI0nGp2LXbwAAMHmgww (envelope-from ); Mon, 13 Jun 2022 23:20:41 +0000 Subject: [PATCH 05/12] VFS: export done_path_update() From: NeilBrown To: Al Viro , Daire Byrne , Trond Myklebust , Chuck Lever Cc: Linux NFS Mailing List , linux-fsdevel@vger.kernel.org, LKML Date: Tue, 14 Jun 2022 09:18:21 +1000 Message-ID: <165516230198.21248.8675541854979948509.stgit@noble.brown> In-Reply-To: <165516173293.21248.14587048046993234326.stgit@noble.brown> References: <165516173293.21248.14587048046993234326.stgit@noble.brown> User-Agent: StGit/1.5 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org This function will be be useful to nfsd, so export it. done_path_create_wq() is now simple enough to be "static inline" rather than an explicit export. Signed-off-by: NeilBrown --- fs/namei.c | 11 ++--------- include/linux/namei.h | 10 +++++++++- 2 files changed, 11 insertions(+), 10 deletions(-) diff --git a/fs/namei.c b/fs/namei.c index f13bff877e30..8ce7aa16b704 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -1719,7 +1719,7 @@ struct dentry *lookup_hash_update_len(const char *name, int nlen, } EXPORT_SYMBOL(lookup_hash_update_len); -static void done_path_update(struct path *path, struct dentry *dentry, +void done_path_update(struct path *path, struct dentry *dentry, wait_queue_head_t *wq) { struct inode *dir = path->dentry->d_inode; @@ -1735,6 +1735,7 @@ static void done_path_update(struct path *path, struct dentry *dentry, dput(dentry); mnt_drop_write(path->mnt); } +EXPORT_SYMBOL(done_path_update); static struct dentry *lookup_fast(struct nameidata *nd, struct inode **inode, @@ -3951,14 +3952,6 @@ struct dentry *kern_path_create(int dfd, const char *pathname, } EXPORT_SYMBOL(kern_path_create); -void done_path_create_wq(struct path *path, struct dentry *dentry, - wait_queue_head_t *wq) -{ - done_path_update(path, dentry, wq); - path_put(path); -} -EXPORT_SYMBOL(done_path_create_wq); - inline struct dentry *user_path_create(int dfd, const char __user *pathname, struct path *path, unsigned int lookup_flags) { diff --git a/include/linux/namei.h b/include/linux/namei.h index f75c6639dd1a..217aa6de9f25 100644 --- a/include/linux/namei.h +++ b/include/linux/namei.h @@ -64,11 +64,19 @@ extern struct dentry *user_path_create(int, const char __user *, struct path *, extern struct dentry *lookup_hash_update_len(const char *name, int nlen, struct path *path, unsigned int flags, wait_queue_head_t *wq); -extern void done_path_create_wq(struct path *, struct dentry *, wait_queue_head_t *wq); +extern void done_path_update(struct path *, struct dentry *, wait_queue_head_t *); +static inline void done_path_create_wq(struct path *path, struct dentry *dentry, + wait_queue_head_t *wq) +{ + done_path_update(path, dentry, wq); + path_put(path); +} + static inline void done_path_create(struct path *path, struct dentry *dentry) { done_path_create_wq(path, dentry, NULL); } + extern struct dentry *kern_path_locked(const char *, struct path *); extern struct dentry *try_lookup_one_len(const char *, struct dentry *, int); From patchwork Mon Jun 13 23:18:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 12880303 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5C9B5C433EF for ; Mon, 13 Jun 2022 23:21:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238421AbiFMXVF (ORCPT ); Mon, 13 Jun 2022 19:21:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52424 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241744AbiFMXVA (ORCPT ); Mon, 13 Jun 2022 19:21:00 -0400 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 502892E6AE; Mon, 13 Jun 2022 16:20:51 -0700 (PDT) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 9D09B21AA0; Mon, 13 Jun 2022 23:20:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1655162449; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Jlt/wINlzSS0GRxWVBSPo25fgSIy4XUJj5J1aoakxuk=; b=Nh+91C5OTCqidSTSzCgGzRIS4INkPneU9bPgOhEx3qexdGmu5ZvtqVCZPlyyAopcKxmAdP jdgUCEc+lzqsCFxKvjksRpIYCVKcXt1NOjoPCD4yvjqASeyzpphDDlUcF9qe5bMha7XBHz SOYHncyCFS42xccsdLT++XuLLubrTUI= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1655162449; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Jlt/wINlzSS0GRxWVBSPo25fgSIy4XUJj5J1aoakxuk=; b=tX855rtC2qyuNJsSkTWZXqzUHJdQKOIAKwCH6vJ8wHbIDAJpYov+WUXIDQ2AH8s5fidgky dv4rJ5wO0rgFWiCA== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 6B2E6134CF; Mon, 13 Jun 2022 23:20:47 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id ufMOCk/Gp2LebwAAMHmgww (envelope-from ); Mon, 13 Jun 2022 23:20:47 +0000 Subject: [PATCH 06/12] VFS: support concurrent renames. From: NeilBrown To: Al Viro , Daire Byrne , Trond Myklebust , Chuck Lever Cc: Linux NFS Mailing List , linux-fsdevel@vger.kernel.org, LKML Date: Tue, 14 Jun 2022 09:18:22 +1000 Message-ID: <165516230199.21248.18142980966152036732.stgit@noble.brown> In-Reply-To: <165516173293.21248.14587048046993234326.stgit@noble.brown> References: <165516173293.21248.14587048046993234326.stgit@noble.brown> User-Agent: StGit/1.5 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org An object can now be renamed from or to a directory in which a create or unlink is concurrently happening. Two or more renames with the one directory can also be concurrent. There is still only one cross-directory rename permitted at a time. Signed-off-by: NeilBrown Reported-by: kernel test robot Reported-by: kernel test robot Reported-by: kernel test robot Reported-by: kernel test robot --- fs/namei.c | 230 ++++++++++++++++++++++++++++++++++++++++++++----- include/linux/namei.h | 9 ++ 2 files changed, 216 insertions(+), 23 deletions(-) diff --git a/fs/namei.c b/fs/namei.c index 8ce7aa16b704..31ba4dbedfcf 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -3172,6 +3172,197 @@ void unlock_rename(struct dentry *p1, struct dentry *p2) } EXPORT_SYMBOL(unlock_rename); +struct dentry *lock_rename_lookup_excl(struct dentry *p1, struct dentry *p2, + struct dentry **d1p, struct dentry **d2p, + struct qstr *last1, struct qstr *last2, + unsigned int flags1, unsigned int flags2) +{ + struct dentry *p; + struct dentry *d1, *d2; + + if (p1 == p2) { + inode_lock_nested(p1->d_inode, I_MUTEX_PARENT); + d1 = __lookup_hash(last1, p1, flags1, NULL); + if (IS_ERR(d1)) + goto out_unlock_1; + d2 = __lookup_hash(last2, p2, flags2, NULL); + if (IS_ERR(d2)) + goto out_unlock_2; + *d1p = d1; *d2p = d2; + return NULL; + out_unlock_2: + dput(d1); + d1 = d2; + out_unlock_1: + inode_unlock(p1->d_inode); + return d1; + } + + mutex_lock(&p1->d_sb->s_vfs_rename_mutex); + + if ((p = d_ancestor(p2, p1)) != NULL) { + inode_lock_nested(p2->d_inode, I_MUTEX_PARENT); + inode_lock_nested(p1->d_inode, I_MUTEX_CHILD); + } else if ((p = d_ancestor(p1, p2)) != NULL) { + inode_lock_nested(p1->d_inode, I_MUTEX_PARENT); + inode_lock_nested(p2->d_inode, I_MUTEX_CHILD); + } else { + inode_lock_nested(p1->d_inode, I_MUTEX_PARENT); + inode_lock_nested(p2->d_inode, I_MUTEX_PARENT2); + } + d1 = __lookup_hash(last1, p1, flags1, NULL); + if (IS_ERR(d1)) + goto unlock_out_3; + d2 = __lookup_hash(last2, p2, flags2, NULL); + if (IS_ERR(d2)) + goto unlock_out_4; + + *d1p = d1; + *d2p = d2; + return p; +unlock_out_4: + dput(d1); + d1 = d2; +unlock_out_3: + inode_unlock(p1->d_inode); + inode_unlock(p2->d_inode); + mutex_unlock(&p1->d_sb->s_vfs_rename_mutex); + return d1; +} + +static struct dentry *lock_rename_lookup(struct dentry *p1, struct dentry *p2, + struct dentry **d1p, struct dentry **d2p, + struct qstr *last1, struct qstr *last2, + unsigned int flags1, unsigned int flags2, + wait_queue_head_t *wq) +{ + struct dentry *p; + struct dentry *d1, *d2; + bool ok1, ok2; + + if (!wq || (p1->d_inode->i_flags & S_PAR_UPDATE) == 0) + return lock_rename_lookup_excl(p1, p2, d1p, d2p, last1, last2, + flags1, flags2); + + if (p1 == p2) { + inode_lock_shared_nested(p1->d_inode, I_MUTEX_PARENT); + retry: + d1 = __lookup_hash(last1, p1, flags1, wq); + if (IS_ERR(d1)) + goto out_unlock_1; + d2 = __lookup_hash(last2, p2, flags2, wq); + if (IS_ERR(d2)) + goto out_unlock_2; + *d1p = d1; *d2p = d2; + + if (d1 < d2) { + ok1 = d_lock_update(d1, p1, last1); + ok2 = d_lock_update(d2, p2, last2); + } else { + ok2 = d_lock_update(d2, p2, last2); + ok1 = d_lock_update(d1, p1, last1); + } + if (!ok1 || !ok2) { + if (ok1) + d_unlock_update(d1); + if (ok2) + d_unlock_update(d2); + dput(d1); + dput(d2); + goto retry; + } + return NULL; + out_unlock_2: + dput(d1); + d1 = d2; + out_unlock_1: + inode_unlock_shared(p1->d_inode); + return d1; + } + + mutex_lock(&p1->d_sb->s_vfs_rename_mutex); + + if ((p = d_ancestor(p2, p1)) != NULL) { + inode_lock_shared_nested(p2->d_inode, I_MUTEX_PARENT); + inode_lock_shared_nested(p1->d_inode, I_MUTEX_CHILD); + } else if ((p = d_ancestor(p1, p2)) != NULL) { + inode_lock_shared_nested(p1->d_inode, I_MUTEX_PARENT); + inode_lock_shared_nested(p2->d_inode, I_MUTEX_CHILD); + } else { + inode_lock_shared_nested(p1->d_inode, I_MUTEX_PARENT); + inode_lock_shared_nested(p2->d_inode, I_MUTEX_PARENT2); + } +retry2: + d1 = __lookup_hash(last1, p1, flags1, wq); + if (IS_ERR(d1)) + goto unlock_out_3; + d2 = __lookup_hash(last2, p2, flags2, wq); + if (IS_ERR(d2)) + goto unlock_out_4; + + ok1 = d_lock_update(d1, p1, last1); + ok2 = d_lock_update(d2, p2, last2); + if (!ok1 || !ok2) { + if (ok1) + d_unlock_update(d1); + if (ok2) + d_unlock_update(d2); + dput(d1); + dput(d2); + goto retry2; + } + *d1p = d1; + *d2p = d2; + return p; +unlock_out_4: + dput(d1); + d1 = d2; +unlock_out_3: + inode_unlock_shared(p1->d_inode); + inode_unlock_shared(p2->d_inode); + mutex_unlock(&p1->d_sb->s_vfs_rename_mutex); + return d1; +} + +struct dentry *lock_rename_lookup_one(struct dentry *p1, struct dentry *p2, + struct dentry **d1p, struct dentry **d2p, + const char *name1, int nlen1, + const char *name2, int nlen2, + unsigned int flags1, unsigned int flags2, + wait_queue_head_t *wq) +{ + struct qstr this1, this2; + int err; + + err = lookup_one_common(&init_user_ns, name1, p1, nlen1, &this1); + if (err) + return ERR_PTR(err); + err = lookup_one_common(&init_user_ns, name2, p2, nlen2, &this2); + if (err) + return ERR_PTR(err); + return lock_rename_lookup(p1, p2, d1p, d2p, &this1, &this2, + flags1, flags2, wq); +} +EXPORT_SYMBOL(lock_rename_lookup_one); + +void unlock_rename_lookup(struct dentry *p1, struct dentry *p2, + struct dentry *d1, struct dentry *d2) +{ + if (d1->d_flags & DCACHE_PAR_UPDATE) { + d_lookup_done(d1); + d_lookup_done(d2); + inode_unlock_shared(p1->d_inode); + if (p1 != p2) { + inode_unlock_shared(p2->d_inode); + mutex_unlock(&p1->d_sb->s_vfs_rename_mutex); + } + } else + unlock_rename(p1, p2); + dput(d1); + dput(d2); +} +EXPORT_SYMBOL(unlock_rename_lookup); + /** * vfs_create - create new file * @mnt_userns: user namespace of the mount the inode was found from @@ -4910,6 +5101,7 @@ int do_renameat2(int olddfd, struct filename *from, int newdfd, unsigned int lookup_flags = 0, target_flags = LOOKUP_RENAME_TARGET; bool should_retry = false; int error = -EINVAL; + DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq); if (flags & ~(RENAME_NOREPLACE | RENAME_EXCHANGE | RENAME_WHITEOUT)) goto put_names; @@ -4950,58 +5142,53 @@ int do_renameat2(int olddfd, struct filename *from, int newdfd, goto exit2; retry_deleg: - trap = lock_rename(new_path.dentry, old_path.dentry); - - old_dentry = __lookup_hash(&old_last, old_path.dentry, - lookup_flags, NULL); - error = PTR_ERR(old_dentry); - if (IS_ERR(old_dentry)) + trap = lock_rename_lookup(new_path.dentry, old_path.dentry, + &new_dentry, &old_dentry, + &new_last, &old_last, + lookup_flags | target_flags, lookup_flags, + &wq); + if (IS_ERR(trap)) goto exit3; /* source must exist */ error = -ENOENT; if (d_is_negative(old_dentry)) goto exit4; - new_dentry = __lookup_hash(&new_last, new_path.dentry, - lookup_flags | target_flags, NULL); - error = PTR_ERR(new_dentry); - if (IS_ERR(new_dentry)) - goto exit4; error = -EEXIST; if ((flags & RENAME_NOREPLACE) && d_is_positive(new_dentry)) - goto exit5; + goto exit4; if (flags & RENAME_EXCHANGE) { error = -ENOENT; if (d_is_negative(new_dentry)) - goto exit5; + goto exit4; if (!d_is_dir(new_dentry)) { error = -ENOTDIR; if (new_last.name[new_last.len]) - goto exit5; + goto exit4; } } /* unless the source is a directory trailing slashes give -ENOTDIR */ if (!d_is_dir(old_dentry)) { error = -ENOTDIR; if (old_last.name[old_last.len]) - goto exit5; + goto exit4; if (!(flags & RENAME_EXCHANGE) && new_last.name[new_last.len]) - goto exit5; + goto exit4; } /* source should not be ancestor of target */ error = -EINVAL; if (old_dentry == trap) - goto exit5; + goto exit4; /* target should not be an ancestor of source */ if (!(flags & RENAME_EXCHANGE)) error = -ENOTEMPTY; if (new_dentry == trap) - goto exit5; + goto exit4; error = security_path_rename(&old_path, old_dentry, &new_path, new_dentry, flags); if (error) - goto exit5; + goto exit4; rd.old_dir = old_path.dentry->d_inode; rd.old_dentry = old_dentry; @@ -5012,12 +5199,9 @@ int do_renameat2(int olddfd, struct filename *from, int newdfd, rd.delegated_inode = &delegated_inode; rd.flags = flags; error = vfs_rename(&rd); -exit5: - dput(new_dentry); exit4: - dput(old_dentry); + unlock_rename_lookup(new_path.dentry, old_path.dentry, new_dentry, old_dentry); exit3: - unlock_rename(new_path.dentry, old_path.dentry); if (delegated_inode) { error = break_deleg_wait(&delegated_inode); if (!error) diff --git a/include/linux/namei.h b/include/linux/namei.h index 217aa6de9f25..b1ea89568de8 100644 --- a/include/linux/namei.h +++ b/include/linux/namei.h @@ -101,6 +101,15 @@ extern int follow_up(struct path *); extern struct dentry *lock_rename(struct dentry *, struct dentry *); extern void unlock_rename(struct dentry *, struct dentry *); +extern struct dentry *lock_rename_lookup_one( + struct dentry *p1, struct dentry *p2, + struct dentry **d1p, struct dentry **d2p, + const char *name1, int nlen1, + const char *name2, int nlen2, + unsigned int flags1, unsigned int flags2, + wait_queue_head_t *wq); +extern void unlock_rename_lookup(struct dentry *p1, struct dentry *p2, + struct dentry *d1, struct dentry *d2); extern int __must_check nd_jump_link(struct path *path); From patchwork Mon Jun 13 23:18:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 12880304 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8034DC433EF for ; Mon, 13 Jun 2022 23:21:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241188AbiFMXVQ (ORCPT ); Mon, 13 Jun 2022 19:21:16 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52440 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238757AbiFMXVC (ORCPT ); Mon, 13 Jun 2022 19:21:02 -0400 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4016F3193F; Mon, 13 Jun 2022 16:20:56 -0700 (PDT) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id F2A431F97B; Mon, 13 Jun 2022 23:20:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1655162455; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Hi+NJU1EkxRpFX0rhWMXFKN8RYkXI07hcWLLJcSj+Iw=; b=Qk8MM4lxyxuscFI/U5jyjURjSgu+7yXFJVTxm/b9ReVhvhBZmz2iewmFxjoVfx91PrN2iB 7MW//rnmUk05e1Ka/VaW10Od8QVd3J6QK+3JZ8C5glTPvti/KXQZelT69OSymITsYqz4XL PTPFUF/WwwJR7xJHZF3Oal/EmxhN3ns= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1655162455; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Hi+NJU1EkxRpFX0rhWMXFKN8RYkXI07hcWLLJcSj+Iw=; b=3yD38L5wibetd93L5oFyPOLph9zHt6MstIvpHDuOGSAKa0kgCCeWPPbn1vWxmBGA9MQa2B W8pR0qZLoA6dcSAg== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id DB3AF134CF; Mon, 13 Jun 2022 23:20:52 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id WdEhJlTGp2LobwAAMHmgww (envelope-from ); Mon, 13 Jun 2022 23:20:52 +0000 Subject: [PATCH 07/12] NFS: support parallel updates in the one directory. From: NeilBrown To: Al Viro , Daire Byrne , Trond Myklebust , Chuck Lever Cc: Linux NFS Mailing List , linux-fsdevel@vger.kernel.org, LKML Date: Tue, 14 Jun 2022 09:18:22 +1000 Message-ID: <165516230200.21248.14713533079253477888.stgit@noble.brown> In-Reply-To: <165516173293.21248.14587048046993234326.stgit@noble.brown> References: <165516173293.21248.14587048046993234326.stgit@noble.brown> User-Agent: StGit/1.5 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org NFS can easily support parallel updates as the locking is done on the server, so this patch enables parallel updates for NFS. NFS unlink needs to block concurrent opens() once it decides to actually unlink the file, rather than rename it to .nfsXXXX (aka sillyrename). It currently does this by temporarily unhashing the dentry and relying on the exclusive lock on the directory to block a ->lookup(). That doesn't work now that unlink uses a shared lock, so an alternate approach is needed. __nfs_lookup_revalidate (->d_revalidate) now blocks if DCACHE_PAR_UPDATE is set, and if nfs_unlink() happens to be called with an exclusive lock and DCACHE_PAR_UPDATE is not set, it get set during the potential race window. I'd rather use some other indicator in the dentry to tell _nfs_lookup_revalidate() to wait, but we are nearly out of d_flags bits, and NFS doesn't have a general-purpose d_fsdata. NFS "silly-rename" may now be called with only a shared lock on the directory, so it needs a bit of extra care to get exclusive access to the new name. d_lock_update_nested() and d_unlock_update() help here. Signed-off-by: NeilBrown --- fs/nfs/dir.c | 29 +++++++++++++++++++++++------ fs/nfs/inode.c | 2 ++ fs/nfs/unlink.c | 5 ++++- 3 files changed, 29 insertions(+), 7 deletions(-) diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index a8ecdd527662..54c2c7adcd56 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -1778,6 +1778,9 @@ __nfs_lookup_revalidate(struct dentry *dentry, unsigned int flags, int ret; if (flags & LOOKUP_RCU) { + if (dentry->d_flags & DCACHE_PAR_UPDATE) + /* Pending unlink */ + return -ECHILD; parent = READ_ONCE(dentry->d_parent); dir = d_inode_rcu(parent); if (!dir) @@ -1786,6 +1789,9 @@ __nfs_lookup_revalidate(struct dentry *dentry, unsigned int flags, if (parent != READ_ONCE(dentry->d_parent)) return -ECHILD; } else { + /* Wait for unlink to complete */ + wait_var_event(&dentry->d_flags, + !(dentry->d_flags & DCACHE_PAR_UPDATE)); parent = dget_parent(dentry); ret = reval(d_inode(parent), dentry, flags); dput(parent); @@ -2453,7 +2459,7 @@ static int nfs_safe_remove(struct dentry *dentry) int nfs_unlink(struct inode *dir, struct dentry *dentry) { int error; - int need_rehash = 0; + bool did_set_par_update = false; dfprintk(VFS, "NFS: unlink(%s/%lu, %pd)\n", dir->i_sb->s_id, dir->i_ino, dentry); @@ -2468,15 +2474,26 @@ int nfs_unlink(struct inode *dir, struct dentry *dentry) error = nfs_sillyrename(dir, dentry); goto out; } - if (!d_unhashed(dentry)) { - __d_drop(dentry); - need_rehash = 1; + /* We must prevent any concurrent open until the unlink + * completes. ->d_revalidate will wait for DCACHE_PAR_UPDATE + * to clear, but if this happens to a non-parallel update, we + * still want to block opens. So set DCACHE_PAR_UPDATE + * temporarily. + */ + if (!(dentry->d_flags & DCACHE_PAR_UPDATE)) { + /* Must have exclusive lock on parent */ + did_set_par_update = true; + dentry->d_flags |= DCACHE_PAR_UPDATE; } + spin_unlock(&dentry->d_lock); error = nfs_safe_remove(dentry); nfs_dentry_remove_handle_error(dir, dentry, error); - if (need_rehash) - d_rehash(dentry); + if (did_set_par_update) { + spin_lock(&dentry->d_lock); + dentry->d_flags &= ~DCACHE_PAR_UPDATE; + spin_unlock(&dentry->d_lock); + } out: trace_nfs_unlink_exit(dir, dentry, error); return error; diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c index b4e46b0ffa2d..cea2554710d2 100644 --- a/fs/nfs/inode.c +++ b/fs/nfs/inode.c @@ -481,6 +481,8 @@ nfs_fhget(struct super_block *sb, struct nfs_fh *fh, struct nfs_fattr *fattr) /* We can't support update_atime(), since the server will reset it */ inode->i_flags |= S_NOATIME|S_NOCMTIME; + /* Parallel updates to directories are trivial */ + inode->i_flags |= S_PAR_UPDATE; inode->i_mode = fattr->mode; nfsi->cache_validity = 0; if ((fattr->valid & NFS_ATTR_FATTR_MODE) == 0 diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c index 9697cd5d2561..52a20eb6131c 100644 --- a/fs/nfs/unlink.c +++ b/fs/nfs/unlink.c @@ -462,6 +462,7 @@ nfs_sillyrename(struct inode *dir, struct dentry *dentry) sdentry = NULL; do { int slen; + d_unlock_update(sdentry); dput(sdentry); sillycounter++; slen = scnprintf(silly, sizeof(silly), @@ -479,7 +480,8 @@ nfs_sillyrename(struct inode *dir, struct dentry *dentry) */ if (IS_ERR(sdentry)) goto out; - } while (d_inode(sdentry) != NULL); /* need negative lookup */ + } while (!d_lock_update_nested(sdentry, NULL, NULL, + SINGLE_DEPTH_NESTING)); ihold(inode); @@ -524,6 +526,7 @@ nfs_sillyrename(struct inode *dir, struct dentry *dentry) rpc_put_task(task); out_dput: iput(inode); + d_unlock_update(sdentry); dput(sdentry); out: return error; From patchwork Mon Jun 13 23:18:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 12880305 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CAD22C43334 for ; Mon, 13 Jun 2022 23:21:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1343543AbiFMXVi (ORCPT ); Mon, 13 Jun 2022 19:21:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52454 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241456AbiFMXV2 (ORCPT ); Mon, 13 Jun 2022 19:21:28 -0400 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4E8193204E; Mon, 13 Jun 2022 16:21:08 -0700 (PDT) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 9086721AA0; Mon, 13 Jun 2022 23:21:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1655162466; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Zvae5qbFLPE6aGXLD70p+XHvaKYzCxL7NtQu9CXSHSo=; b=JUuLIxo5v6mR2RWqhYFTpBZhAgnDrWra+KfVdWwlemURh81e0yn6RPVKX2JhIja57rUxRG Mhf6Y1TNPdTPexy2gArx8aKoySipDoz+fHrI9nzs/s+EIquFJjgVnyFp6L49M53mRhP4Y+ BPiwpptaDfGE+urYgdb9R1C1yNrj1dM= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1655162466; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Zvae5qbFLPE6aGXLD70p+XHvaKYzCxL7NtQu9CXSHSo=; b=Ez1SXRYApDfGu3J1wt02dviUw2QqYbGeP9WfCFQ9irHE2Xi34pPHVDzFmOuZYaANkoV04O fIxPMTN0cwPrrdDA== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 4A56B134CF; Mon, 13 Jun 2022 23:21:02 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id 3rrSAV7Gp2LwbwAAMHmgww (envelope-from ); Mon, 13 Jun 2022 23:21:02 +0000 Subject: [PATCH 08/12] nfsd: allow parallel creates from nfsd From: NeilBrown To: Al Viro , Daire Byrne , Trond Myklebust , Chuck Lever Cc: Linux NFS Mailing List , linux-fsdevel@vger.kernel.org, LKML Date: Tue, 14 Jun 2022 09:18:22 +1000 Message-ID: <165516230200.21248.15108802355330895562.stgit@noble.brown> In-Reply-To: <165516173293.21248.14587048046993234326.stgit@noble.brown> References: <165516173293.21248.14587048046993234326.stgit@noble.brown> User-Agent: StGit/1.5 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Rather than getting write access, locking the directory, and performing a lookup, use filename_create_one_len() for create, or lookup_hash_update_len() for delete which combines all these steps and handles shared locking for concurrent updates where appropriate. As we don't use fh_lock() we need to call fh_fill_pre_attrs() explicitly. However if we only get a shared lock, then the pre/post attributes won't be atomic, so we need to ensure that fh know that, and doesn't try to encode wcc attrs either. Note that there is only one filesystem that allows shared locks for create/unlink and that is NFS (for re-export). NFS does support atomic pre/post attributes, so there is no loss in not providing them. When some other filesystem supports concurrent updates, we might need to consider if the pre/post attributes are more important than the parallelism. Signed-off-by: NeilBrown Reviewed-by: Chuck Lever --- fs/nfsd/nfs3proc.c | 31 +++++------- fs/nfsd/nfs4proc.c | 32 +++++------- fs/nfsd/nfsfh.c | 7 ++- fs/nfsd/nfsfh.h | 4 +- fs/nfsd/nfsproc.c | 29 +++++------ fs/nfsd/vfs.c | 134 +++++++++++++++++++++++----------------------------- 6 files changed, 105 insertions(+), 132 deletions(-) diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c index 981a3a7a6e16..0fdbb9504a87 100644 --- a/fs/nfsd/nfs3proc.c +++ b/fs/nfsd/nfs3proc.c @@ -231,12 +231,14 @@ static __be32 nfsd3_create_file(struct svc_rqst *rqstp, struct svc_fh *fhp, struct svc_fh *resfhp, struct nfsd3_createargs *argp) { + struct path path; struct iattr *iap = &argp->attrs; - struct dentry *parent, *child; + struct dentry *child; __u32 v_mtime, v_atime; struct inode *inode; __be32 status; int host_err; + DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq); if (isdotent(argp->name, argp->len)) return nfserr_exist; @@ -247,20 +249,15 @@ nfsd3_create_file(struct svc_rqst *rqstp, struct svc_fh *fhp, if (status != nfs_ok) return status; - parent = fhp->fh_dentry; - inode = d_inode(parent); + path.dentry = fhp->fh_dentry; + path.mnt = fhp->fh_export->ex_path.mnt; + inode = d_inode(path.dentry); - host_err = fh_want_write(fhp); - if (host_err) - return nfserrno(host_err); + child = filename_create_one_len(argp->name, argp->len, + &path, 0, &wq); - fh_lock_nested(fhp, I_MUTEX_PARENT); - - child = lookup_one_len(argp->name, parent, argp->len); - if (IS_ERR(child)) { - status = nfserrno(PTR_ERR(child)); - goto out; - } + if (IS_ERR(child)) + return nfserrno(PTR_ERR(child)); if (d_really_is_negative(child)) { status = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE); @@ -311,6 +308,7 @@ nfsd3_create_file(struct svc_rqst *rqstp, struct svc_fh *fhp, if (!IS_POSIXACL(inode)) iap->ia_mode &= ~current_umask(); + fh_fill_pre_attrs(fhp, (child->d_flags & DCACHE_PAR_UPDATE) == 0); host_err = vfs_create(&init_user_ns, inode, child, iap->ia_mode, true); if (host_err < 0) { @@ -332,12 +330,9 @@ nfsd3_create_file(struct svc_rqst *rqstp, struct svc_fh *fhp, set_attr: status = nfsd_create_setattr(rqstp, fhp, resfhp, iap); - + fh_fill_post_attrs(fhp); out: - fh_unlock(fhp); - if (child && !IS_ERR(child)) - dput(child); - fh_drop_write(fhp); + done_path_update(&path, child, &wq); return status; } diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c index 3895eb52d2b1..71a4b8ef77f0 100644 --- a/fs/nfsd/nfs4proc.c +++ b/fs/nfsd/nfs4proc.c @@ -285,12 +285,13 @@ static __be32 nfsd4_create_file(struct svc_rqst *rqstp, struct svc_fh *fhp, struct svc_fh *resfhp, struct nfsd4_open *open) { + struct path path; struct iattr *iap = &open->op_iattr; - struct dentry *parent, *child; + struct dentry *child; __u32 v_mtime, v_atime; struct inode *inode; __be32 status; - int host_err; + DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq); if (isdotent(open->op_fname, open->op_fnamelen)) return nfserr_exist; @@ -300,20 +301,17 @@ nfsd4_create_file(struct svc_rqst *rqstp, struct svc_fh *fhp, status = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC); if (status != nfs_ok) return status; - parent = fhp->fh_dentry; - inode = d_inode(parent); - host_err = fh_want_write(fhp); - if (host_err) - return nfserrno(host_err); + path.dentry = fhp->fh_dentry; + path.mnt = fhp->fh_export->ex_path.mnt; + inode = d_inode(path.dentry); - fh_lock_nested(fhp, I_MUTEX_PARENT); + child = filename_create_one_len(open->op_fname, + open->op_fnamelen, + &path, 0, &wq); - child = lookup_one_len(open->op_fname, parent, open->op_fnamelen); - if (IS_ERR(child)) { - status = nfserrno(PTR_ERR(child)); - goto out; - } + if (IS_ERR(child)) + return nfserrno(PTR_ERR(child)); if (d_really_is_negative(child)) { status = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE); @@ -386,6 +384,7 @@ nfsd4_create_file(struct svc_rqst *rqstp, struct svc_fh *fhp, if (!IS_POSIXACL(inode)) iap->ia_mode &= ~current_umask(); + fh_fill_pre_attrs(fhp, (child->d_flags & DCACHE_PAR_UPDATE) == 0); status = nfsd4_vfs_create(fhp, child, open); if (status != nfs_ok) goto out; @@ -405,12 +404,9 @@ nfsd4_create_file(struct svc_rqst *rqstp, struct svc_fh *fhp, set_attr: status = nfsd_create_setattr(rqstp, fhp, resfhp, iap); - + fh_fill_post_attrs(fhp); out: - fh_unlock(fhp); - if (child && !IS_ERR(child)) - dput(child); - fh_drop_write(fhp); + done_path_update(&path, child, &wq); return status; } diff --git a/fs/nfsd/nfsfh.c b/fs/nfsd/nfsfh.c index c29baa03dfaf..a50db688c60d 100644 --- a/fs/nfsd/nfsfh.c +++ b/fs/nfsd/nfsfh.c @@ -616,7 +616,7 @@ fh_update(struct svc_fh *fhp) * @fhp: file handle to be updated * */ -void fh_fill_pre_attrs(struct svc_fh *fhp) +void fh_fill_pre_attrs(struct svc_fh *fhp, bool atomic) { bool v4 = (fhp->fh_maxsize == NFS4_FHSIZE); struct inode *inode; @@ -626,6 +626,11 @@ void fh_fill_pre_attrs(struct svc_fh *fhp) if (fhp->fh_no_wcc || fhp->fh_pre_saved) return; + if (!atomic) { + fhp->fh_no_atomic_attr = true; + fhp->fh_no_wcc = true; + } + inode = d_inode(fhp->fh_dentry); err = fh_getattr(fhp, &stat); if (err) { diff --git a/fs/nfsd/nfsfh.h b/fs/nfsd/nfsfh.h index fb9d358a267e..ecc57fd3fd67 100644 --- a/fs/nfsd/nfsfh.h +++ b/fs/nfsd/nfsfh.h @@ -320,7 +320,7 @@ static inline u64 nfsd4_change_attribute(struct kstat *stat, return time_to_chattr(&stat->ctime); } -extern void fh_fill_pre_attrs(struct svc_fh *fhp); +extern void fh_fill_pre_attrs(struct svc_fh *fhp, bool atomic); extern void fh_fill_post_attrs(struct svc_fh *fhp); @@ -347,7 +347,7 @@ fh_lock_nested(struct svc_fh *fhp, unsigned int subclass) inode = d_inode(dentry); inode_lock_nested(inode, subclass); - fh_fill_pre_attrs(fhp); + fh_fill_pre_attrs(fhp, true); fhp->fh_locked = true; } diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c index fcdab8a8a41f..2dccf77634e8 100644 --- a/fs/nfsd/nfsproc.c +++ b/fs/nfsd/nfsproc.c @@ -255,6 +255,7 @@ nfsd_proc_write(struct svc_rqst *rqstp) static __be32 nfsd_proc_create(struct svc_rqst *rqstp) { + struct path path; struct nfsd_createargs *argp = rqstp->rq_argp; struct nfsd_diropres *resp = rqstp->rq_resp; svc_fh *dirfhp = &argp->fh; @@ -263,8 +264,8 @@ nfsd_proc_create(struct svc_rqst *rqstp) struct inode *inode; struct dentry *dchild; int type, mode; - int hosterr; dev_t rdev = 0, wanted = new_decode_dev(attr->ia_size); + DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq); dprintk("nfsd: CREATE %s %.*s\n", SVCFH_fmt(dirfhp), argp->len, argp->name); @@ -279,17 +280,13 @@ nfsd_proc_create(struct svc_rqst *rqstp) resp->status = nfserr_exist; if (isdotent(argp->name, argp->len)) goto done; - hosterr = fh_want_write(dirfhp); - if (hosterr) { - resp->status = nfserrno(hosterr); - goto done; - } - fh_lock_nested(dirfhp, I_MUTEX_PARENT); - dchild = lookup_one_len(argp->name, dirfhp->fh_dentry, argp->len); + path.dentry = dirfhp->fh_dentry; + path.mnt = dirfhp->fh_export->ex_path.mnt; + dchild = filename_create_one_len(argp->name, argp->len, &path, 0, &wq); if (IS_ERR(dchild)) { resp->status = nfserrno(PTR_ERR(dchild)); - goto out_unlock; + goto out_done; } fh_init(newfhp, NFS_FHSIZE); resp->status = fh_compose(newfhp, dirfhp->fh_export, dchild, dirfhp); @@ -298,7 +295,7 @@ nfsd_proc_create(struct svc_rqst *rqstp) dput(dchild); if (resp->status) { if (resp->status != nfserr_noent) - goto out_unlock; + goto out_done; /* * If the new file handle wasn't verified, we can't tell * whether the file exists or not. Time to bail ... @@ -307,7 +304,7 @@ nfsd_proc_create(struct svc_rqst *rqstp) if (!newfhp->fh_dentry) { printk(KERN_WARNING "nfsd_proc_create: file handle not verified\n"); - goto out_unlock; + goto out_done; } } @@ -341,7 +338,7 @@ nfsd_proc_create(struct svc_rqst *rqstp) newfhp->fh_dentry, NFSD_MAY_WRITE|NFSD_MAY_LOCAL_ACCESS); if (resp->status && resp->status != nfserr_rofs) - goto out_unlock; + goto out_done; } } else type = S_IFREG; @@ -378,7 +375,7 @@ nfsd_proc_create(struct svc_rqst *rqstp) /* Make sure the type and device matches */ resp->status = nfserr_exist; if (inode && inode_wrong_type(inode, type)) - goto out_unlock; + goto out_done; } resp->status = nfs_ok; @@ -400,10 +397,8 @@ nfsd_proc_create(struct svc_rqst *rqstp) (time64_t)0); } -out_unlock: - /* We don't really need to unlock, as fh_put does it. */ - fh_unlock(dirfhp); - fh_drop_write(dirfhp); +out_done: + done_path_update(&path, dchild, &wq); done: fh_put(dirfhp); if (resp->status != nfs_ok) diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index 840e3af63a6f..6cdd5e407600 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -1274,12 +1274,6 @@ nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp, dirp = d_inode(dentry); dchild = dget(resfhp->fh_dentry); - if (!fhp->fh_locked) { - WARN_ONCE(1, "nfsd_create: parent %pd2 not locked!\n", - dentry); - err = nfserr_io; - goto out; - } err = nfsd_permission(rqstp, fhp->fh_export, dentry, NFSD_MAY_CREATE); if (err) @@ -1362,9 +1356,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp, char *fname, int flen, struct iattr *iap, int type, dev_t rdev, struct svc_fh *resfhp) { - struct dentry *dentry, *dchild = NULL; + struct path path; + struct dentry *dchild = NULL; __be32 err; int host_err; + DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq); if (isdotent(fname, flen)) return nfserr_exist; @@ -1373,27 +1369,22 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp, if (err) return err; - dentry = fhp->fh_dentry; - - host_err = fh_want_write(fhp); - if (host_err) - return nfserrno(host_err); + path.dentry = fhp->fh_dentry; + path.mnt = fhp->fh_export->ex_path.mnt; - fh_lock_nested(fhp, I_MUTEX_PARENT); - dchild = lookup_one_len(fname, dentry, flen); + dchild = filename_create_one_len(fname, flen, &path, 0, &wq); host_err = PTR_ERR(dchild); if (IS_ERR(dchild)) return nfserrno(host_err); err = fh_compose(resfhp, fhp->fh_export, dchild, fhp); - /* - * We unconditionally drop our ref to dchild as fh_compose will have - * already grabbed its own ref for it. - */ - dput(dchild); - if (err) - return err; - return nfsd_create_locked(rqstp, fhp, fname, flen, iap, type, - rdev, resfhp); + if (!err) { + fh_fill_pre_attrs(fhp, (dchild->d_flags & DCACHE_PAR_UPDATE) == 0); + err = nfsd_create_locked(rqstp, fhp, fname, flen, iap, type, + rdev, resfhp); + fh_fill_post_attrs(fhp); + } + done_path_update(&path, dchild, &wq); + return err; } /* @@ -1441,15 +1432,17 @@ nfsd_readlink(struct svc_rqst *rqstp, struct svc_fh *fhp, char *buf, int *lenp) __be32 nfsd_symlink(struct svc_rqst *rqstp, struct svc_fh *fhp, char *fname, int flen, - char *path, + char *lpath, struct svc_fh *resfhp) { - struct dentry *dentry, *dnew; + struct path path; + struct dentry *dnew; __be32 err, cerr; int host_err; + DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq); err = nfserr_noent; - if (!flen || path[0] == '\0') + if (!flen || lpath[0] == '\0') goto out; err = nfserr_exist; if (isdotent(fname, flen)) @@ -1459,28 +1452,28 @@ nfsd_symlink(struct svc_rqst *rqstp, struct svc_fh *fhp, if (err) goto out; - host_err = fh_want_write(fhp); - if (host_err) - goto out_nfserr; + path.dentry = fhp->fh_dentry; + path.mnt = fhp->fh_export->ex_path.mnt; - fh_lock(fhp); - dentry = fhp->fh_dentry; - dnew = lookup_one_len(fname, dentry, flen); + dnew = filename_create_one_len(fname, flen, &path, 0, &wq); host_err = PTR_ERR(dnew); if (IS_ERR(dnew)) goto out_nfserr; - host_err = vfs_symlink(&init_user_ns, d_inode(dentry), dnew, path); + fh_fill_pre_attrs(fhp, (dnew->d_flags & DCACHE_PAR_UPDATE) == 0); + host_err = vfs_symlink(mnt_user_ns(path.mnt), d_inode(path.dentry), + dnew, lpath); err = nfserrno(host_err); - fh_unlock(fhp); if (!err) err = nfserrno(commit_metadata(fhp)); - fh_drop_write(fhp); + fh_fill_post_attrs(fhp); cerr = fh_compose(resfhp, fhp->fh_export, dnew, fhp); - dput(dnew); - if (err==0) err = cerr; + if (err==0) + err = cerr; + + done_path_update(&path, dnew, &wq); out: return err; @@ -1497,10 +1490,12 @@ __be32 nfsd_link(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *name, int len, struct svc_fh *tfhp) { - struct dentry *ddir, *dnew, *dold; + struct path path; + struct dentry *dold, *dnew; struct inode *dirp; __be32 err; int host_err; + DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq); err = fh_verify(rqstp, ffhp, S_IFDIR, NFSD_MAY_CREATE); if (err) @@ -1518,17 +1513,11 @@ nfsd_link(struct svc_rqst *rqstp, struct svc_fh *ffhp, if (isdotent(name, len)) goto out; - host_err = fh_want_write(tfhp); - if (host_err) { - err = nfserrno(host_err); - goto out; - } - - fh_lock_nested(ffhp, I_MUTEX_PARENT); - ddir = ffhp->fh_dentry; - dirp = d_inode(ddir); + path.dentry = ffhp->fh_dentry; + path.mnt = ffhp->fh_export->ex_path.mnt; + dirp = d_inode(path.dentry); - dnew = lookup_one_len(name, ddir, len); + dnew = filename_create_one_len(name, len, &path, 0, &wq); host_err = PTR_ERR(dnew); if (IS_ERR(dnew)) goto out_nfserr; @@ -1537,9 +1526,10 @@ nfsd_link(struct svc_rqst *rqstp, struct svc_fh *ffhp, err = nfserr_noent; if (d_really_is_negative(dold)) - goto out_dput; + goto out_done; + fh_fill_pre_attrs(ffhp, (dnew->d_flags & DCACHE_PAR_UPDATE) == 0); host_err = vfs_link(dold, &init_user_ns, dirp, dnew, NULL); - fh_unlock(ffhp); + if (!host_err) { err = nfserrno(commit_metadata(ffhp)); if (!err) @@ -1550,17 +1540,15 @@ nfsd_link(struct svc_rqst *rqstp, struct svc_fh *ffhp, else err = nfserrno(host_err); } -out_dput: - dput(dnew); -out_unlock: - fh_unlock(ffhp); - fh_drop_write(tfhp); +out_done: + fh_fill_post_attrs(ffhp); + done_path_update(&path, dnew, &wq); out: return err; out_nfserr: err = nfserrno(host_err); - goto out_unlock; + goto out; } static void @@ -1625,8 +1613,8 @@ nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen, * so do it by hand */ trap = lock_rename(tdentry, fdentry); ffhp->fh_locked = tfhp->fh_locked = true; - fh_fill_pre_attrs(ffhp); - fh_fill_pre_attrs(tfhp); + fh_fill_pre_attrs(ffhp, true); + fh_fill_pre_attrs(tfhp, true); odentry = lookup_one_len(fname, fdentry, flen); host_err = PTR_ERR(odentry); @@ -1717,11 +1705,13 @@ __be32 nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type, char *fname, int flen) { - struct dentry *dentry, *rdentry; + struct dentry *rdentry; struct inode *dirp; struct inode *rinode; __be32 err; int host_err; + struct path path; + DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq); err = nfserr_acces; if (!flen || isdotent(fname, flen)) @@ -1730,24 +1720,18 @@ nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type, if (err) goto out; - host_err = fh_want_write(fhp); - if (host_err) - goto out_nfserr; + path.mnt = fhp->fh_export->ex_path.mnt; + path.dentry = fhp->fh_dentry; - fh_lock_nested(fhp, I_MUTEX_PARENT); - dentry = fhp->fh_dentry; - dirp = d_inode(dentry); + rdentry = lookup_hash_update_len(fname, flen, &path, 0, &wq); + dirp = d_inode(path.dentry); - rdentry = lookup_one_len(fname, dentry, flen); host_err = PTR_ERR(rdentry); if (IS_ERR(rdentry)) - goto out_drop_write; + goto out_nfserr; + + fh_fill_pre_attrs(fhp, (rdentry->d_flags & DCACHE_PAR_UPDATE) == 0); - if (d_really_is_negative(rdentry)) { - dput(rdentry); - host_err = -ENOENT; - goto out_drop_write; - } rinode = d_inode(rdentry); ihold(rinode); @@ -1761,15 +1745,13 @@ nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type, } else { host_err = vfs_rmdir(&init_user_ns, dirp, rdentry); } + fh_fill_post_attrs(fhp); - fh_unlock(fhp); + done_path_update(&path, rdentry, &wq); if (!host_err) host_err = commit_metadata(fhp); - dput(rdentry); iput(rinode); /* truncate the inode here */ -out_drop_write: - fh_drop_write(fhp); out_nfserr: if (host_err == -EBUSY) { /* name is mounted-on. There is no perfect From patchwork Mon Jun 13 23:18:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 12880306 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5B911C433EF for ; Mon, 13 Jun 2022 23:22:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241052AbiFMXWD (ORCPT ); Mon, 13 Jun 2022 19:22:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53488 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344736AbiFMXVv (ORCPT ); Mon, 13 Jun 2022 19:21:51 -0400 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DA09D2B25E; Mon, 13 Jun 2022 16:21:38 -0700 (PDT) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 9A93321A94; Mon, 13 Jun 2022 23:21:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1655162497; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=H2JdOUtYS+FdB9BrIAmOm6nhk+rYQ/dm6rYDIh55AvU=; b=COvRa52h8t3jlyOI2GLKfLw0FRdz9F/uwAnFIDQ2qhjfAZthIOeXjmVE09j9C5Wi8ibCXD UyIqZS1Ljm5/zzGz4ECoZConPG5hPUYybs5qBEnuYFuImz4/nmmx5n6LNRRNt+iR8Ue14A mvV4+DgVxa7fWelX9YlPuatTUXXSJMQ= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1655162497; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=H2JdOUtYS+FdB9BrIAmOm6nhk+rYQ/dm6rYDIh55AvU=; b=8FO1rfDBUEXsdnOBT3BpxLyrw3HAjQeCz6ULNJlPwD6EGxHCnUVkQUcZ9d97e15Gdav3/S jQGZXPeP/bIUMTDw== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 624BE134CF; Mon, 13 Jun 2022 23:21:35 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id I5SpB3/Gp2IScAAAMHmgww (envelope-from ); Mon, 13 Jun 2022 23:21:35 +0000 Subject: [PATCH 09/12] nfsd: support concurrent renames. From: NeilBrown To: Al Viro , Daire Byrne , Trond Myklebust , Chuck Lever Cc: Linux NFS Mailing List , linux-fsdevel@vger.kernel.org, LKML Date: Tue, 14 Jun 2022 09:18:22 +1000 Message-ID: <165516230201.21248.13160043266041158437.stgit@noble.brown> In-Reply-To: <165516173293.21248.14587048046993234326.stgit@noble.brown> References: <165516173293.21248.14587048046993234326.stgit@noble.brown> User-Agent: StGit/1.5 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org If the filesystem supports it, renames can now be concurrent with other updates. We use lock_rename_lookup_one() to do the appropriate locking in the right order and to look up the names. Signed-off-by: NeilBrown Reviewed-by: Chuck Lever --- fs/nfsd/vfs.c | 49 +++++++++++++++++++------------------------------ 1 file changed, 19 insertions(+), 30 deletions(-) diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index 6cdd5e407600..b0df216ab3e4 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -1584,6 +1584,7 @@ nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen, __be32 err; int host_err; bool close_cached = false; + DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq); err = fh_verify(rqstp, ffhp, S_IFDIR, NFSD_MAY_REMOVE); if (err) @@ -1611,41 +1612,37 @@ nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen, /* cannot use fh_lock as we need deadlock protective ordering * so do it by hand */ - trap = lock_rename(tdentry, fdentry); - ffhp->fh_locked = tfhp->fh_locked = true; - fh_fill_pre_attrs(ffhp, true); - fh_fill_pre_attrs(tfhp, true); - - odentry = lookup_one_len(fname, fdentry, flen); - host_err = PTR_ERR(odentry); - if (IS_ERR(odentry)) + trap = lock_rename_lookup_one(tdentry, fdentry, &ndentry, &odentry, + tname, tlen, fname, flen, 0, 0, &wq); + host_err = PTR_ERR(trap); + if (IS_ERR(trap)) goto out_nfserr; + ffhp->fh_locked = tfhp->fh_locked = true; + fh_fill_pre_attrs(ffhp, (ndentry->d_flags & DCACHE_PAR_UPDATE) == 0); + fh_fill_pre_attrs(tfhp, (ndentry->d_flags & DCACHE_PAR_UPDATE) == 0); host_err = -ENOENT; if (d_really_is_negative(odentry)) - goto out_dput_old; + goto out_unlock; host_err = -EINVAL; if (odentry == trap) - goto out_dput_old; + goto out_unlock; - ndentry = lookup_one_len(tname, tdentry, tlen); - host_err = PTR_ERR(ndentry); - if (IS_ERR(ndentry)) - goto out_dput_old; host_err = -ENOTEMPTY; if (ndentry == trap) - goto out_dput_new; + goto out_unlock; host_err = -EXDEV; if (ffhp->fh_export->ex_path.mnt != tfhp->fh_export->ex_path.mnt) - goto out_dput_new; + goto out_unlock; if (ffhp->fh_export->ex_path.dentry != tfhp->fh_export->ex_path.dentry) - goto out_dput_new; + goto out_unlock; if ((ndentry->d_sb->s_export_op->flags & EXPORT_OP_CLOSE_BEFORE_UNLINK) && nfsd_has_cached_files(ndentry)) { close_cached = true; - goto out_dput_old; + dget(ndentry); + goto out_unlock; } else { struct renamedata rd = { .old_mnt_userns = &init_user_ns, @@ -1662,23 +1659,15 @@ nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen, host_err = commit_metadata(ffhp); } } - out_dput_new: - dput(ndentry); - out_dput_old: - dput(odentry); - out_nfserr: - err = nfserrno(host_err); - /* - * We cannot rely on fh_unlock on the two filehandles, - * as that would do the wrong thing if the two directories - * were the same, so again we do it by hand. - */ if (!close_cached) { fh_fill_post_attrs(ffhp); fh_fill_post_attrs(tfhp); } - unlock_rename(tdentry, fdentry); + out_unlock: + unlock_rename_lookup(tdentry, fdentry, ndentry, odentry); ffhp->fh_locked = tfhp->fh_locked = false; + out_nfserr: + err = nfserrno(host_err); fh_drop_write(ffhp); /* From patchwork Mon Jun 13 23:18:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 12880307 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0AFBCCCA47B for ; Mon, 13 Jun 2022 23:22:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S245082AbiFMXWG (ORCPT ); Mon, 13 Jun 2022 19:22:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53518 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241744AbiFMXVx (ORCPT ); Mon, 13 Jun 2022 19:21:53 -0400 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 037A53193A; Mon, 13 Jun 2022 16:21:45 -0700 (PDT) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id B801F21A94; Mon, 13 Jun 2022 23:21:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1655162503; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=CUF4AhRWCiEgz1MsSrvILiRjzwqzC+l2o2t3OmAkQ6Q=; b=UGQ2Y2JgDGR1dlSMXaTaex3knm9MLPW56eaKkkz9xCh6sbq4Sa0G+5ENyb2xDwWSRd0IEA +4DWhsqqqxyimJ3wJyl2gurDUQmBdPCprXRQ/G7pVeBLirmyI8+hpH/rwQssT2XTB3ftvZ y/C9JHSIASC+jMJ6huxI2bDdgKXB9JM= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1655162503; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=CUF4AhRWCiEgz1MsSrvILiRjzwqzC+l2o2t3OmAkQ6Q=; b=mD53fllLcILh7klqva97XPMH3l4JpWjPOb7Xne1ms67faTChW6wmn0QSYSDwJtdnrgmEdk IKZc/6Abb+dyVCBA== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 7F489134CF; Mon, 13 Jun 2022 23:21:41 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id cWexDoXGp2IYcAAAMHmgww (envelope-from ); Mon, 13 Jun 2022 23:21:41 +0000 Subject: [PATCH 10/12] nfsd: reduce locking in nfsd_lookup() From: NeilBrown To: Al Viro , Daire Byrne , Trond Myklebust , Chuck Lever Cc: Linux NFS Mailing List , linux-fsdevel@vger.kernel.org, LKML Date: Tue, 14 Jun 2022 09:18:22 +1000 Message-ID: <165516230202.21248.2917435222861526857.stgit@noble.brown> In-Reply-To: <165516173293.21248.14587048046993234326.stgit@noble.brown> References: <165516173293.21248.14587048046993234326.stgit@noble.brown> User-Agent: StGit/1.5 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org nfsd_lookup() takes an exclusive lock on the parent inode, but many callers don't want the lock and may not need to lock at all if the result is in the dcache. Also this is the only place where the fh_locked flag is needed, as nfsd_lookup() may or may not return with the inode locked, and the fh_locked flag tells which. Change nfsd_lookup() to be passed a pointer to an int flag. If the pointer is NULL, don't take the lock. If not, record in the int whether the lock is held. Then in that one place that care, pass a pointer to an int, and be sure to unlock if necessary. Signed-off-by: NeilBrown Reviewed-by: Chuck Lever --- fs/nfsd/nfs3proc.c | 2 +- fs/nfsd/nfs4proc.c | 27 +++++++++++++-------------- fs/nfsd/nfsproc.c | 2 +- fs/nfsd/vfs.c | 36 +++++++++++++++++++++++++----------- fs/nfsd/vfs.h | 8 +++++--- 5 files changed, 45 insertions(+), 30 deletions(-) diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c index 0fdbb9504a87..d85b110d58dd 100644 --- a/fs/nfsd/nfs3proc.c +++ b/fs/nfsd/nfs3proc.c @@ -96,7 +96,7 @@ nfsd3_proc_lookup(struct svc_rqst *rqstp) resp->status = nfsd_lookup(rqstp, &resp->dirfh, argp->name, argp->len, - &resp->fh); + &resp->fh, NULL); return rpc_success; } diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c index 71a4b8ef77f0..79434e29b63f 100644 --- a/fs/nfsd/nfs4proc.c +++ b/fs/nfsd/nfs4proc.c @@ -411,7 +411,9 @@ nfsd4_create_file(struct svc_rqst *rqstp, struct svc_fh *fhp, } static __be32 -do_open_lookup(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, struct nfsd4_open *open, struct svc_fh **resfh) +do_open_lookup(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, + struct nfsd4_open *open, struct svc_fh **resfh, + int *locked) { struct svc_fh *current_fh = &cstate->current_fh; int accmode; @@ -455,14 +457,9 @@ do_open_lookup(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, stru open->op_bmval[1] |= (FATTR4_WORD1_TIME_ACCESS | FATTR4_WORD1_TIME_MODIFY); } else - /* - * Note this may exit with the parent still locked. - * We will hold the lock until nfsd4_open's final - * lookup, to prevent renames or unlinks until we've had - * a chance to an acquire a delegation if appropriate. - */ status = nfsd_lookup(rqstp, current_fh, - open->op_fname, open->op_fnamelen, *resfh); + open->op_fname, open->op_fnamelen, *resfh, + locked); if (status) goto out; status = nfsd_check_obj_isreg(*resfh); @@ -537,6 +534,7 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, struct net *net = SVC_NET(rqstp); struct nfsd_net *nn = net_generic(net, nfsd_net_id); bool reclaim = false; + int locked = 0; dprintk("NFSD: nfsd4_open filename %.*s op_openowner %p\n", (int)open->op_fnamelen, open->op_fname, @@ -598,7 +596,7 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, switch (open->op_claim_type) { case NFS4_OPEN_CLAIM_DELEGATE_CUR: case NFS4_OPEN_CLAIM_NULL: - status = do_open_lookup(rqstp, cstate, open, &resfh); + status = do_open_lookup(rqstp, cstate, open, &resfh, &locked); if (status) goto out; break; @@ -636,6 +634,8 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, fput(open->op_filp); open->op_filp = NULL; } + if (locked) + inode_unlock(cstate->current_fh.fh_dentry->d_inode); if (resfh && resfh != &cstate->current_fh) { fh_dup2(&cstate->current_fh, resfh); fh_put(resfh); @@ -920,7 +920,7 @@ static __be32 nfsd4_do_lookupp(struct svc_rqst *rqstp, struct svc_fh *fh) return nfserr_noent; } fh_put(&tmp_fh); - return nfsd_lookup(rqstp, fh, "..", 2, fh); + return nfsd_lookup(rqstp, fh, "..", 2, fh, NULL); } static __be32 @@ -936,7 +936,7 @@ nfsd4_lookup(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, { return nfsd_lookup(rqstp, &cstate->current_fh, u->lookup.lo_name, u->lookup.lo_len, - &cstate->current_fh); + &cstate->current_fh, NULL); } static __be32 @@ -1078,11 +1078,10 @@ nfsd4_secinfo(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, if (err) return err; err = nfsd_lookup_dentry(rqstp, &cstate->current_fh, - secinfo->si_name, secinfo->si_namelen, - &exp, &dentry); + secinfo->si_name, secinfo->si_namelen, + &exp, &dentry, NULL); if (err) return err; - fh_unlock(&cstate->current_fh); if (d_really_is_negative(dentry)) { exp_put(exp); err = nfserr_noent; diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c index 2dccf77634e8..465d70e053f6 100644 --- a/fs/nfsd/nfsproc.c +++ b/fs/nfsd/nfsproc.c @@ -133,7 +133,7 @@ nfsd_proc_lookup(struct svc_rqst *rqstp) fh_init(&resp->fh, NFS_FHSIZE); resp->status = nfsd_lookup(rqstp, &argp->fh, argp->name, argp->len, - &resp->fh); + &resp->fh, NULL); fh_put(&argp->fh); if (resp->status != nfs_ok) goto out; diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index b0df216ab3e4..4c2e431100ba 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -172,7 +172,8 @@ int nfsd_mountpoint(struct dentry *dentry, struct svc_export *exp) __be32 nfsd_lookup_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp, const char *name, unsigned int len, - struct svc_export **exp_ret, struct dentry **dentry_ret) + struct svc_export **exp_ret, struct dentry **dentry_ret, + int *locked) { struct svc_export *exp; struct dentry *dparent; @@ -184,6 +185,9 @@ nfsd_lookup_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp, dparent = fhp->fh_dentry; exp = exp_get(fhp->fh_export); + if (locked) + *locked = 0; + /* Lookup the name, but don't follow links */ if (isdotent(name, len)) { if (len==1) @@ -199,13 +203,15 @@ nfsd_lookup_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp, goto out_nfserr; } } else { - /* - * In the nfsd4_open() case, this may be held across - * subsequent open and delegation acquisition which may - * need to take the child's i_mutex: - */ - fh_lock_nested(fhp, I_MUTEX_PARENT); - dentry = lookup_one_len(name, dparent, len); + if (locked) { + inode_lock_nested(dparent->d_inode, I_MUTEX_PARENT); + dentry = lookup_one_len(name, dparent, len); + if (IS_ERR(dentry)) + inode_unlock(dparent->d_inode); + else + *locked = 1; + } else + dentry = lookup_one_len_unlocked(name, dparent, len); host_err = PTR_ERR(dentry); if (IS_ERR(dentry)) goto out_nfserr; @@ -218,7 +224,10 @@ nfsd_lookup_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp, * something that we might be about to delegate, * and a mountpoint won't be renamed: */ - fh_unlock(fhp); + if (locked && *locked) { + inode_unlock(dparent->d_inode); + *locked = 0; + } if ((host_err = nfsd_cross_mnt(rqstp, &dentry, &exp))) { dput(dentry); goto out_nfserr; @@ -248,7 +257,8 @@ nfsd_lookup_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp, */ __be32 nfsd_lookup(struct svc_rqst *rqstp, struct svc_fh *fhp, const char *name, - unsigned int len, struct svc_fh *resfh) + unsigned int len, struct svc_fh *resfh, + int *locked) { struct svc_export *exp; struct dentry *dentry; @@ -257,7 +267,7 @@ nfsd_lookup(struct svc_rqst *rqstp, struct svc_fh *fhp, const char *name, err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC); if (err) return err; - err = nfsd_lookup_dentry(rqstp, fhp, name, len, &exp, &dentry); + err = nfsd_lookup_dentry(rqstp, fhp, name, len, &exp, &dentry, locked); if (err) return err; err = check_nfsd_access(exp, rqstp); @@ -273,6 +283,10 @@ nfsd_lookup(struct svc_rqst *rqstp, struct svc_fh *fhp, const char *name, out: dput(dentry); exp_put(exp); + if (err && locked && *locked) { + inode_unlock(fhp->fh_dentry->d_inode); + *locked = 0; + } return err; } diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h index 26347d76f44a..b7d41b73dd79 100644 --- a/fs/nfsd/vfs.h +++ b/fs/nfsd/vfs.h @@ -45,10 +45,12 @@ typedef int (*nfsd_filldir_t)(void *, const char *, int, loff_t, u64, unsigned); int nfsd_cross_mnt(struct svc_rqst *rqstp, struct dentry **dpp, struct svc_export **expp); __be32 nfsd_lookup(struct svc_rqst *, struct svc_fh *, - const char *, unsigned int, struct svc_fh *); + const char *, unsigned int, struct svc_fh *, + int *); __be32 nfsd_lookup_dentry(struct svc_rqst *, struct svc_fh *, - const char *, unsigned int, - struct svc_export **, struct dentry **); + const char *, unsigned int, + struct svc_export **, struct dentry **, + int *); __be32 nfsd_setattr(struct svc_rqst *, struct svc_fh *, struct iattr *, int, time64_t); int nfsd_mountpoint(struct dentry *, struct svc_export *); From patchwork Mon Jun 13 23:18:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 12880309 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C2798CCA47B for ; Mon, 13 Jun 2022 23:22:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S245591AbiFMXWL (ORCPT ); Mon, 13 Jun 2022 19:22:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52422 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345914AbiFMXV4 (ORCPT ); Mon, 13 Jun 2022 19:21:56 -0400 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E996232057; Mon, 13 Jun 2022 16:21:50 -0700 (PDT) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id A8A501F97B; Mon, 13 Jun 2022 23:21:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1655162509; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=oMccZnZTMuPJ2V0RFKCrt4ZJ93PoZsHv4InGH+qJUso=; b=VKjD9dl4WKgqy+qwHUPJUMCOG9EgkISb94nr5isNlb2KKHDYAApSRs45O0QMJlCzhljIwt YOFaDD+TX6RDIzFuHzxDrDU3jR0O6wfGPNRZJVAtdQflcXG8CpcoMzWE5Afc94SbmGLVUC H+n9Ua/4mh+pUfM/w/YWwGM9199QdqU= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1655162509; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=oMccZnZTMuPJ2V0RFKCrt4ZJ93PoZsHv4InGH+qJUso=; b=aDzlr0KMDwebjinIlh9/ilLCiOPvscFz1hfoFYprQI0uiZ5BZkXiJd9H0tmRrJRPmDWBDF zuBKCbM6O8KuQQAw== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 6C489134CF; Mon, 13 Jun 2022 23:21:47 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id c2sGCovGp2IbcAAAMHmgww (envelope-from ); Mon, 13 Jun 2022 23:21:47 +0000 Subject: [PATCH 11/12] nfsd: use (un)lock_inode instead of fh_(un)lock From: NeilBrown To: Al Viro , Daire Byrne , Trond Myklebust , Chuck Lever Cc: Linux NFS Mailing List , linux-fsdevel@vger.kernel.org, LKML Date: Tue, 14 Jun 2022 09:18:22 +1000 Message-ID: <165516230203.21248.2885738961424931868.stgit@noble.brown> In-Reply-To: <165516173293.21248.14587048046993234326.stgit@noble.brown> References: <165516173293.21248.14587048046993234326.stgit@noble.brown> User-Agent: StGit/1.5 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Except for the xattr changes, these callers don't need to save pre/post attrs, so use a simple lock/unlock. For the xattr changes, make the saving of pre/post explicit rather than requiring a comment. Also many fh_unlock() are not needed. Signed-off-by: NeilBrown --- fs/nfsd/nfs2acl.c | 6 +++--- fs/nfsd/nfs3acl.c | 4 ++-- fs/nfsd/nfs3proc.c | 4 ---- fs/nfsd/nfs4acl.c | 7 +++---- fs/nfsd/nfs4proc.c | 2 -- fs/nfsd/nfs4state.c | 8 ++++---- fs/nfsd/nfsfh.c | 1 - fs/nfsd/nfsfh.h | 8 -------- fs/nfsd/vfs.c | 26 ++++++++++++-------------- 9 files changed, 24 insertions(+), 42 deletions(-) diff --git a/fs/nfsd/nfs2acl.c b/fs/nfsd/nfs2acl.c index b5760801d377..9edd3c1a30fb 100644 --- a/fs/nfsd/nfs2acl.c +++ b/fs/nfsd/nfs2acl.c @@ -111,7 +111,7 @@ static __be32 nfsacld_proc_setacl(struct svc_rqst *rqstp) if (error) goto out_errno; - fh_lock(fh); + inode_lock(inode); error = set_posix_acl(&init_user_ns, inode, ACL_TYPE_ACCESS, argp->acl_access); @@ -122,7 +122,7 @@ static __be32 nfsacld_proc_setacl(struct svc_rqst *rqstp) if (error) goto out_drop_lock; - fh_unlock(fh); + inode_unlock(inode); fh_drop_write(fh); @@ -136,7 +136,7 @@ static __be32 nfsacld_proc_setacl(struct svc_rqst *rqstp) return rpc_success; out_drop_lock: - fh_unlock(fh); + inode_unlock(inode); fh_drop_write(fh); out_errno: resp->status = nfserrno(error); diff --git a/fs/nfsd/nfs3acl.c b/fs/nfsd/nfs3acl.c index 35b2ebda14da..9446c6743664 100644 --- a/fs/nfsd/nfs3acl.c +++ b/fs/nfsd/nfs3acl.c @@ -101,7 +101,7 @@ static __be32 nfsd3_proc_setacl(struct svc_rqst *rqstp) if (error) goto out_errno; - fh_lock(fh); + inode_lock(inode); error = set_posix_acl(&init_user_ns, inode, ACL_TYPE_ACCESS, argp->acl_access); @@ -111,7 +111,7 @@ static __be32 nfsd3_proc_setacl(struct svc_rqst *rqstp) argp->acl_default); out_drop_lock: - fh_unlock(fh); + inode_unlock(inode); fh_drop_write(fh); out_errno: resp->status = nfserrno(error); diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c index d85b110d58dd..7bb07c7e6ee8 100644 --- a/fs/nfsd/nfs3proc.c +++ b/fs/nfsd/nfs3proc.c @@ -374,7 +374,6 @@ nfsd3_proc_mkdir(struct svc_rqst *rqstp) fh_init(&resp->fh, NFS3_FHSIZE); resp->status = nfsd_create(rqstp, &resp->dirfh, argp->name, argp->len, &argp->attrs, S_IFDIR, 0, &resp->fh); - fh_unlock(&resp->dirfh); return rpc_success; } @@ -449,7 +448,6 @@ nfsd3_proc_mknod(struct svc_rqst *rqstp) type = nfs3_ftypes[argp->ftype]; resp->status = nfsd_create(rqstp, &resp->dirfh, argp->name, argp->len, &argp->attrs, type, rdev, &resp->fh); - fh_unlock(&resp->dirfh); out: return rpc_success; } @@ -472,7 +470,6 @@ nfsd3_proc_remove(struct svc_rqst *rqstp) fh_copy(&resp->fh, &argp->fh); resp->status = nfsd_unlink(rqstp, &resp->fh, -S_IFDIR, argp->name, argp->len); - fh_unlock(&resp->fh); return rpc_success; } @@ -493,7 +490,6 @@ nfsd3_proc_rmdir(struct svc_rqst *rqstp) fh_copy(&resp->fh, &argp->fh); resp->status = nfsd_unlink(rqstp, &resp->fh, S_IFDIR, argp->name, argp->len); - fh_unlock(&resp->fh); return rpc_success; } diff --git a/fs/nfsd/nfs4acl.c b/fs/nfsd/nfs4acl.c index eaa3a0cf38f1..7bcc6dc0f762 100644 --- a/fs/nfsd/nfs4acl.c +++ b/fs/nfsd/nfs4acl.c @@ -779,19 +779,18 @@ nfsd4_set_nfs4_acl(struct svc_rqst *rqstp, struct svc_fh *fhp, if (host_error < 0) goto out_nfserr; - fh_lock(fhp); + inode_lock(inode); host_error = set_posix_acl(&init_user_ns, inode, ACL_TYPE_ACCESS, pacl); if (host_error < 0) goto out_drop_lock; - if (S_ISDIR(inode->i_mode)) { + if (S_ISDIR(inode->i_mode)) host_error = set_posix_acl(&init_user_ns, inode, ACL_TYPE_DEFAULT, dpacl); - } out_drop_lock: - fh_unlock(fhp); + inode_unlock(inode); posix_acl_release(pacl); posix_acl_release(dpacl); diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c index 79434e29b63f..d6defaf5a77a 100644 --- a/fs/nfsd/nfs4proc.c +++ b/fs/nfsd/nfs4proc.c @@ -860,7 +860,6 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, do_set_nfs4_acl(rqstp, &resfh, create->cr_acl, create->cr_bmval); - fh_unlock(&cstate->current_fh); set_change_info(&create->cr_cinfo, &cstate->current_fh); fh_dup2(&cstate->current_fh, &resfh); out: @@ -1040,7 +1039,6 @@ nfsd4_remove(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, status = nfsd_unlink(rqstp, &cstate->current_fh, 0, remove->rm_name, remove->rm_namelen); if (!status) { - fh_unlock(&cstate->current_fh); set_change_info(&remove->rm_cinfo, &cstate->current_fh); } return status; diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index 9409a0dc1b76..cb664c61b546 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -7321,21 +7321,21 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, static __be32 nfsd_test_lock(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file_lock *lock) { struct nfsd_file *nf; + struct inode *inode = fhp->fh_dentry->d_inode; __be32 err; err = nfsd_file_acquire(rqstp, fhp, NFSD_MAY_READ, &nf); if (err) return err; - fh_lock(fhp); /* to block new leases till after test_lock: */ - err = nfserrno(nfsd_open_break_lease(fhp->fh_dentry->d_inode, - NFSD_MAY_READ)); + inode_lock(inode); /* to block new leases till after test_lock: */ + err = nfserrno(nfsd_open_break_lease(inode, NFSD_MAY_READ)); if (err) goto out; lock->fl_file = nf->nf_file; err = nfserrno(vfs_test_lock(nf->nf_file, lock)); lock->fl_file = NULL; out: - fh_unlock(fhp); + inode_unlock(inode); nfsd_file_put(nf); return err; } diff --git a/fs/nfsd/nfsfh.c b/fs/nfsd/nfsfh.c index a50db688c60d..ae270e4f921f 100644 --- a/fs/nfsd/nfsfh.c +++ b/fs/nfsd/nfsfh.c @@ -685,7 +685,6 @@ fh_put(struct svc_fh *fhp) struct dentry * dentry = fhp->fh_dentry; struct svc_export * exp = fhp->fh_export; if (dentry) { - fh_unlock(fhp); fhp->fh_dentry = NULL; dput(dentry); fh_clear_pre_post_attrs(fhp); diff --git a/fs/nfsd/nfsfh.h b/fs/nfsd/nfsfh.h index ecc57fd3fd67..c5061cdb1016 100644 --- a/fs/nfsd/nfsfh.h +++ b/fs/nfsd/nfsfh.h @@ -323,14 +323,6 @@ static inline u64 nfsd4_change_attribute(struct kstat *stat, extern void fh_fill_pre_attrs(struct svc_fh *fhp, bool atomic); extern void fh_fill_post_attrs(struct svc_fh *fhp); - -/* - * Lock a file handle/inode - * NOTE: both fh_lock and fh_unlock are done "by hand" in - * vfs.c:nfsd_rename as it needs to grab 2 i_mutex's at once - * so, any changes here should be reflected there. - */ - static inline void fh_lock_nested(struct svc_fh *fhp, unsigned int subclass) { diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index 4c2e431100ba..f2f4868618bb 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -253,7 +253,6 @@ nfsd_lookup_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp, * returned. Otherwise the covered directory is returned. * NOTE: this mountpoint crossing is not supported properly by all * clients and is explicitly disallowed for NFSv3 - * NeilBrown */ __be32 nfsd_lookup(struct svc_rqst *rqstp, struct svc_fh *fhp, const char *name, @@ -434,7 +433,7 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, return err; } - fh_lock(fhp); + inode_lock(inode); if (size_change) { /* * RFC5661, Section 18.30.4: @@ -470,7 +469,7 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, host_err = notify_change(&init_user_ns, dentry, iap, NULL); out_unlock: - fh_unlock(fhp); + inode_unlock(inode); if (size_change) put_write_access(inode); out: @@ -2123,12 +2122,8 @@ nfsd_listxattr(struct svc_rqst *rqstp, struct svc_fh *fhp, char **bufp, } /* - * Removexattr and setxattr need to call fh_lock to both lock the inode - * and set the change attribute. Since the top-level vfs_removexattr - * and vfs_setxattr calls already do their own inode_lock calls, call - * the _locked variant. Pass in a NULL pointer for delegated_inode, - * and let the client deal with NFS4ERR_DELAY (same as with e.g. - * setattr and remove). + * Pass in a NULL pointer for delegated_inode, and let the client deal + * with NFS4ERR_DELAY (same as with e.g. setattr and remove). */ __be32 nfsd_removexattr(struct svc_rqst *rqstp, struct svc_fh *fhp, char *name) @@ -2144,12 +2139,14 @@ nfsd_removexattr(struct svc_rqst *rqstp, struct svc_fh *fhp, char *name) if (ret) return nfserrno(ret); - fh_lock(fhp); + inode_lock(fhp->fh_dentry->d_inode); + fh_fill_pre_attrs(fhp, true); ret = __vfs_removexattr_locked(&init_user_ns, fhp->fh_dentry, name, NULL); - fh_unlock(fhp); + fh_fill_post_attrs(fhp); + inode_unlock(fhp->fh_dentry->d_inode); fh_drop_write(fhp); return nfsd_xattr_errno(ret); @@ -2169,12 +2166,13 @@ nfsd_setxattr(struct svc_rqst *rqstp, struct svc_fh *fhp, char *name, ret = fh_want_write(fhp); if (ret) return nfserrno(ret); - fh_lock(fhp); + inode_lock(fhp->fh_dentry->d_inode); + fh_fill_pre_attrs(fhp, true); ret = __vfs_setxattr_locked(&init_user_ns, fhp->fh_dentry, name, buf, len, flags, NULL); - - fh_unlock(fhp); + fh_fill_post_attrs(fhp); + inode_unlock(fhp->fh_dentry->d_inode); fh_drop_write(fhp); return nfsd_xattr_errno(ret); From patchwork Mon Jun 13 23:18:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 12880308 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 31716C43334 for ; Mon, 13 Jun 2022 23:22:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S244914AbiFMXWJ (ORCPT ); Mon, 13 Jun 2022 19:22:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52440 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229808AbiFMXV5 (ORCPT ); Mon, 13 Jun 2022 19:21:57 -0400 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A2B8531934; Mon, 13 Jun 2022 16:21:56 -0700 (PDT) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 5F64A21A94; Mon, 13 Jun 2022 23:21:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1655162515; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=QYs285JaZzfqdkUZfAfHwZpMuNRbKqBl3ciDgYuYxqA=; b=et/ijtr4zzBtQZn1hpaR6U+mm7mUKnkcxTlgtxuIQ0Z4Qv48UR3hXQwkv6aC7j+52+0kCW HlfWzUYjkombYWtdjMcbXd5p6cSWMJ+9KffQFAteJqUjmQf2ld6D/Q+r5FxclXYtmEs4nU Y/vQ2fNm0xc8Be0tEzKl1B+Bqkfqckg= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1655162515; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=QYs285JaZzfqdkUZfAfHwZpMuNRbKqBl3ciDgYuYxqA=; b=TxYsvrvWSN89CQtbx6XuyaMg6ZA3tcc/EDJkxSydFMBqczlr+4Woyl6sXnE3CDfqY5N3v3 0+CftBQ93RtT6dAA== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 27176134CF; Mon, 13 Jun 2022 23:21:52 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id JbXBNJDGp2IqcAAAMHmgww (envelope-from ); Mon, 13 Jun 2022 23:21:52 +0000 Subject: [PATCH 12/12] nfsd: discard fh_locked flag and fh_lock/fh_unlock From: NeilBrown To: Al Viro , Daire Byrne , Trond Myklebust , Chuck Lever Cc: Linux NFS Mailing List , linux-fsdevel@vger.kernel.org, LKML Date: Tue, 14 Jun 2022 09:18:22 +1000 Message-ID: <165516230204.21248.4630581281540290265.stgit@noble.brown> In-Reply-To: <165516173293.21248.14587048046993234326.stgit@noble.brown> References: <165516173293.21248.14587048046993234326.stgit@noble.brown> User-Agent: StGit/1.5 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org fh_lock() and fh_unlock() are no longer used, so discard them. They are the only real users of ->fh_locked, so discard that too. Signed-off-by: NeilBrown Reviewed-by: Chuck Lever --- fs/nfsd/nfsfh.c | 2 +- fs/nfsd/nfsfh.h | 48 ++++-------------------------------------------- fs/nfsd/vfs.c | 4 ---- 3 files changed, 5 insertions(+), 49 deletions(-) diff --git a/fs/nfsd/nfsfh.c b/fs/nfsd/nfsfh.c index ae270e4f921f..a3dbe9f34c0e 100644 --- a/fs/nfsd/nfsfh.c +++ b/fs/nfsd/nfsfh.c @@ -548,7 +548,7 @@ fh_compose(struct svc_fh *fhp, struct svc_export *exp, struct dentry *dentry, if (ref_fh == fhp) fh_put(ref_fh); - if (fhp->fh_locked || fhp->fh_dentry) { + if (fhp->fh_dentry) { printk(KERN_ERR "fh_compose: fh %pd2 not initialized!\n", dentry); } diff --git a/fs/nfsd/nfsfh.h b/fs/nfsd/nfsfh.h index c5061cdb1016..559912b1d794 100644 --- a/fs/nfsd/nfsfh.h +++ b/fs/nfsd/nfsfh.h @@ -81,7 +81,6 @@ typedef struct svc_fh { struct dentry * fh_dentry; /* validated dentry */ struct svc_export * fh_export; /* export pointer */ - bool fh_locked; /* inode locked by us */ bool fh_want_write; /* remount protection taken */ bool fh_no_wcc; /* no wcc data needed */ bool fh_no_atomic_attr; @@ -93,7 +92,7 @@ typedef struct svc_fh { bool fh_post_saved; /* post-op attrs saved */ bool fh_pre_saved; /* pre-op attrs saved */ - /* Pre-op attributes saved during fh_lock */ + /* Pre-op attributes saved when inode exclusively locked */ __u64 fh_pre_size; /* size before operation */ struct timespec64 fh_pre_mtime; /* mtime before oper */ struct timespec64 fh_pre_ctime; /* ctime before oper */ @@ -103,7 +102,7 @@ typedef struct svc_fh { */ u64 fh_pre_change; - /* Post-op attributes saved in fh_unlock */ + /* Post-op attributes saved in fh_fill_post_attrs() */ struct kstat fh_post_attr; /* full attrs after operation */ u64 fh_post_change; /* nfsv4 change; see above */ } svc_fh; @@ -223,8 +222,8 @@ void fh_put(struct svc_fh *); static __inline__ struct svc_fh * fh_copy(struct svc_fh *dst, struct svc_fh *src) { - WARN_ON(src->fh_dentry || src->fh_locked); - + WARN_ON(src->fh_dentry); + *dst = *src; return dst; } @@ -323,43 +322,4 @@ static inline u64 nfsd4_change_attribute(struct kstat *stat, extern void fh_fill_pre_attrs(struct svc_fh *fhp, bool atomic); extern void fh_fill_post_attrs(struct svc_fh *fhp); -static inline void -fh_lock_nested(struct svc_fh *fhp, unsigned int subclass) -{ - struct dentry *dentry = fhp->fh_dentry; - struct inode *inode; - - BUG_ON(!dentry); - - if (fhp->fh_locked) { - printk(KERN_WARNING "fh_lock: %pd2 already locked!\n", - dentry); - return; - } - - inode = d_inode(dentry); - inode_lock_nested(inode, subclass); - fh_fill_pre_attrs(fhp, true); - fhp->fh_locked = true; -} - -static inline void -fh_lock(struct svc_fh *fhp) -{ - fh_lock_nested(fhp, I_MUTEX_NORMAL); -} - -/* - * Unlock a file handle/inode - */ -static inline void -fh_unlock(struct svc_fh *fhp) -{ - if (fhp->fh_locked) { - fh_fill_post_attrs(fhp); - inode_unlock(d_inode(fhp->fh_dentry)); - fhp->fh_locked = false; - } -} - #endif /* _LINUX_NFSD_NFSFH_H */ diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index f2f4868618bb..0e07b19a0289 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -1623,14 +1623,11 @@ nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen, goto out; } - /* cannot use fh_lock as we need deadlock protective ordering - * so do it by hand */ trap = lock_rename_lookup_one(tdentry, fdentry, &ndentry, &odentry, tname, tlen, fname, flen, 0, 0, &wq); host_err = PTR_ERR(trap); if (IS_ERR(trap)) goto out_nfserr; - ffhp->fh_locked = tfhp->fh_locked = true; fh_fill_pre_attrs(ffhp, (ndentry->d_flags & DCACHE_PAR_UPDATE) == 0); fh_fill_pre_attrs(tfhp, (ndentry->d_flags & DCACHE_PAR_UPDATE) == 0); @@ -1678,7 +1675,6 @@ nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen, } out_unlock: unlock_rename_lookup(tdentry, fdentry, ndentry, odentry); - ffhp->fh_locked = tfhp->fh_locked = false; out_nfserr: err = nfserrno(host_err); fh_drop_write(ffhp);