From patchwork Thu Feb 6 05:42:38 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 13962211
From: NeilBrown
To: Alexander Viro, Christian Brauner, Jan Kara, Linus Torvalds,
 Jeff Layton, Dave Chinner
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 01/19] VFS: introduce vfs_mkdir_return()
Date: Thu, 6 Feb 2025 16:42:38 +1100
Message-ID: <20250206054504.2950516-2-neilb@suse.de>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250206054504.2950516-1-neilb@suse.de>
References: <20250206054504.2950516-1-neilb@suse.de>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

vfs_mkdir() does not guarantee to make the child dentry positive on
success.  It may leave it negative, and then the caller needs to perform
a lookup to find the target dentry.

This patch introduces vfs_mkdir_return(), which performs the lookup if
needed so that this code is centralised.

This prepares for a new inode operation which will perform mkdir and
return the correct dentry.
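For illustration only (not part of the patch): a minimal userspace
sketch of the calling convention described above, where the callee may
replace the caller's dentry pointer.  The toy_* types and helpers are
hypothetical stand-ins, not kernel APIs, and the "lookup" is reduced to
allocating a fresh positive object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for struct dentry; NOT the kernel structure. */
struct toy_dentry {
	int refcount;
	bool positive;	/* an inode is attached */
	bool hashed;	/* reachable by name lookup */
};

static struct toy_dentry *toy_alloc(bool positive, bool hashed)
{
	struct toy_dentry *d = malloc(sizeof(*d));

	if (!d)
		abort();
	d->refcount = 1;
	d->positive = positive;
	d->hashed = hashed;
	return d;
}

static void toy_put(struct toy_dentry *d)
{
	if (--d->refcount == 0)
		free(d);
}

/*
 * Hypothetical ->mkdir: with fs_swaps set it succeeds but leaves the
 * dentry it was given unhashed and negative.
 */
static int toy_fs_mkdir(struct toy_dentry *d, bool fs_swaps)
{
	if (fs_swaps)
		d->hashed = false;
	else
		d->positive = true;
	return 0;
}

/* Models the pattern of vfs_mkdir_return(): may replace *dentryp. */
static int toy_mkdir_return(struct toy_dentry **dentryp, bool fs_swaps)
{
	struct toy_dentry *dentry = *dentryp;
	int error = toy_fs_mkdir(dentry, fs_swaps);

	if (error)
		return error;
	if (!dentry->hashed) {
		/* Stand-in for the lookup that finds the real dentry. */
		struct toy_dentry *d = toy_alloc(true, true);

		toy_put(dentry);	/* drop the unused dentry */
		*dentryp = d;		/* hand back the positive one */
	}
	assert((*dentryp)->positive);
	return 0;
}
```

In this sketch a caller passes the address of its pointer and must use
whatever that pointer holds afterwards, since the original object may
already have been released.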

Signed-off-by: NeilBrown
---
 fs/cachefiles/namei.c    |  7 +---
 fs/namei.c               | 69 ++++++++++++++++++++++++++++++++++++++++
 fs/nfsd/vfs.c            | 21 ++----------
 fs/overlayfs/dir.c       | 33 +------------------
 fs/overlayfs/overlayfs.h | 10 +++---
 fs/overlayfs/super.c     |  2 +-
 fs/smb/server/vfs.c      | 24 +++-----------
 include/linux/fs.h       |  2 ++
 8 files changed, 86 insertions(+), 82 deletions(-)

diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 7cf59713f0f7..3c866c3b9534 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -95,7 +95,6 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 	/* search the current directory for the element name */
 	inode_lock_nested(d_inode(dir), I_MUTEX_PARENT);
 
-retry:
 	ret = cachefiles_inject_read_error();
 	if (ret == 0)
 		subdir = lookup_one_len(dirname, dir, strlen(dirname));
@@ -130,7 +129,7 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 			goto mkdir_error;
 		ret = cachefiles_inject_write_error();
 		if (ret == 0)
-			ret = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), subdir, 0700);
+			ret = vfs_mkdir_return(&nop_mnt_idmap, d_inode(dir), &subdir, 0700);
 		if (ret < 0) {
 			trace_cachefiles_vfs_error(NULL, d_inode(dir), ret,
 						   cachefiles_trace_mkdir_error);
@@ -138,10 +137,6 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 		}
 		trace_cachefiles_mkdir(dir, subdir);
 
-		if (unlikely(d_unhashed(subdir))) {
-			cachefiles_put_directory(subdir);
-			goto retry;
-		}
 		ASSERT(d_backing_inode(subdir));
 
 		_debug("mkdir -> %pd{ino=%lu}",
diff --git a/fs/namei.c b/fs/namei.c
index 3ab9440c5b93..d98caf36e867 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4317,6 +4317,75 @@ int vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 }
 EXPORT_SYMBOL(vfs_mkdir);
 
+/**
+ * vfs_mkdir_return - create directory returning correct dentry
+ * @idmap:	idmap of the mount the inode was found from
+ * @dir:	inode of the parent directory
+ * @dentryp:	pointer to dentry of the child directory
+ * @mode:	mode of the child directory
+ *
+ * Create a directory.
+ *
+ * If the inode has been found through an idmapped mount the idmap of
+ * the vfsmount must be passed through @idmap. This function will then take
+ * care to map the inode according to @idmap before checking permissions.
+ * On non-idmapped mounts or if permission checking is to be performed on the
+ * raw inode simply pass @nop_mnt_idmap.
+ *
+ * The filesystem may not use the dentry that was passed in. In that case
+ * the passed-in dentry is put and a new one is placed in *@dentryp;
+ * So on successful return *@dentryp will always be positive.
+ */
+int vfs_mkdir_return(struct mnt_idmap *idmap, struct inode *dir,
+		     struct dentry **dentryp, umode_t mode)
+{
+	struct dentry *dentry = *dentryp;
+	int error;
+	unsigned max_links = dir->i_sb->s_max_links;
+
+	error = may_create(idmap, dir, dentry);
+	if (error)
+		return error;
+
+	if (!dir->i_op->mkdir)
+		return -EPERM;
+
+	mode = vfs_prepare_mode(idmap, dir, mode, S_IRWXUGO | S_ISVTX, 0);
+	error = security_inode_mkdir(dir, dentry, mode);
+	if (error)
+		return error;
+
+	if (max_links && dir->i_nlink >= max_links)
+		return -EMLINK;
+
+	error = dir->i_op->mkdir(idmap, dir, dentry, mode);
+	if (!error) {
+		fsnotify_mkdir(dir, dentry);
+		if (unlikely(d_unhashed(dentry))) {
+			struct dentry *d;
+			/* Need a "const" pointer. We know d_name is const
+			 * because we hold an exclusive lock on i_rwsem
+			 * in d_parent.
+			 */
+			const struct qstr *d_name = (void*)&dentry->d_name;
+			d = lookup_dcache(d_name, dentry->d_parent, 0);
+			if (!d)
+				d = __lookup_slow(d_name, dentry->d_parent, 0);
+			if (IS_ERR(d)) {
+				error = PTR_ERR(d);
+			} else if (unlikely(d_is_negative(d))) {
+				dput(d);
+				error = -ENOENT;
+			} else {
+				dput(dentry);
+				*dentryp = d;
+			}
+		}
+	}
+	return error;
+}
+EXPORT_SYMBOL(vfs_mkdir_return);
+
 int do_mkdirat(int dfd, struct filename *name, umode_t mode)
 {
 	struct dentry *dentry;
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 29cb7b812d71..740332413138 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1488,26 +1488,11 @@ nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp,
 			nfsd_check_ignore_resizing(iap);
 		break;
 	case S_IFDIR:
-		host_err = vfs_mkdir(&nop_mnt_idmap, dirp, dchild, iap->ia_mode);
-		if (!host_err && unlikely(d_unhashed(dchild))) {
-			struct dentry *d;
-			d = lookup_one_len(dchild->d_name.name,
-					   dchild->d_parent,
-					   dchild->d_name.len);
-			if (IS_ERR(d)) {
-				host_err = PTR_ERR(d);
-				break;
-			}
-			if (unlikely(d_is_negative(d))) {
-				dput(d);
-				err = nfserr_serverfault;
-				goto out;
-			}
+		host_err = vfs_mkdir_return(&nop_mnt_idmap, dirp, &dchild, iap->ia_mode);
+		if (!host_err && unlikely(dchild != resfhp->fh_dentry)) {
 			dput(resfhp->fh_dentry);
-			resfhp->fh_dentry = dget(d);
+			resfhp->fh_dentry = dget(dchild);
 			err = fh_update(resfhp);
-			dput(dchild);
-			dchild = d;
 			if (err)
 				goto out;
 		}
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index c9993ff66fc2..e6c54c6ef0f5 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -138,37 +138,6 @@ int ovl_cleanup_and_whiteout(struct ovl_fs *ofs, struct inode *dir,
 	goto out;
 }
 
-int ovl_mkdir_real(struct ovl_fs *ofs, struct inode *dir,
-		   struct dentry **newdentry, umode_t mode)
-{
-	int err;
-	struct dentry *d, *dentry = *newdentry;
-
-	err = ovl_do_mkdir(ofs, dir, dentry, mode);
-	if (err)
-		return err;
-
-	if (likely(!d_unhashed(dentry)))
-		return 0;
-
-	/*
-	 * vfs_mkdir() may succeed and leave the dentry passed
-	 * to it unhashed and negative. If that happens, try to
-	 * lookup a new hashed and positive dentry.
-	 */
-	d = ovl_lookup_upper(ofs, dentry->d_name.name, dentry->d_parent,
-			     dentry->d_name.len);
-	if (IS_ERR(d)) {
-		pr_warn("failed lookup after mkdir (%pd2, err=%i).\n",
-			dentry, err);
-		return PTR_ERR(d);
-	}
-	dput(dentry);
-	*newdentry = d;
-
-	return 0;
-}
-
 struct dentry *ovl_create_real(struct ovl_fs *ofs, struct inode *dir,
 			       struct dentry *newdentry, struct ovl_cattr *attr)
 {
@@ -191,7 +160,7 @@ struct dentry *ovl_create_real(struct ovl_fs *ofs, struct inode *dir,
 
 		case S_IFDIR:
 			/* mkdir is special... */
-			err = ovl_mkdir_real(ofs, dir, &newdentry, attr->mode);
+			err = ovl_do_mkdir(ofs, dir, &newdentry, attr->mode);
 			break;
 
 		case S_IFCHR:
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 0021e2025020..967870f12482 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -242,11 +242,11 @@ static inline int ovl_do_create(struct ovl_fs *ofs,
 }
 
 static inline int ovl_do_mkdir(struct ovl_fs *ofs,
-			       struct inode *dir, struct dentry *dentry,
+			       struct inode *dir, struct dentry **dentry,
 			       umode_t mode)
 {
-	int err = vfs_mkdir(ovl_upper_mnt_idmap(ofs), dir, dentry, mode);
-	pr_debug("mkdir(%pd2, 0%o) = %i\n", dentry, mode, err);
+	int err = vfs_mkdir_return(ovl_upper_mnt_idmap(ofs), dir, dentry, mode);
+	pr_debug("mkdir(%pd2, 0%o) = %i\n", *dentry, mode, err);
 	return err;
 }
 
@@ -838,8 +838,8 @@ struct ovl_cattr {
 
 #define OVL_CATTR(m) (&(struct ovl_cattr) { .mode = (m) })
 
-int ovl_mkdir_real(struct ovl_fs *ofs, struct inode *dir,
-		   struct dentry **newdentry, umode_t mode);
+int ovl_do_mkdir(struct ovl_fs *ofs, struct inode *dir,
+		 struct dentry **newdentry, umode_t mode);
 struct dentry *ovl_create_real(struct ovl_fs *ofs, struct inode *dir,
 			       struct dentry *newdentry,
 			       struct ovl_cattr *attr);
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 86ae6f6da36b..06ca8b01c336 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -327,7 +327,7 @@ static struct dentry *ovl_workdir_create(struct ovl_fs *ofs,
 			goto retry;
 		}
 
-		err = ovl_mkdir_real(ofs, dir, &work, attr.ia_mode);
+		err = ovl_do_mkdir(ofs, dir, &work, attr.ia_mode);
 		if (err)
 			goto out_dput;
 
diff --git a/fs/smb/server/vfs.c b/fs/smb/server/vfs.c
index 6890016e1923..4e580bb7baf8 100644
--- a/fs/smb/server/vfs.c
+++ b/fs/smb/server/vfs.c
@@ -211,7 +211,7 @@ int ksmbd_vfs_mkdir(struct ksmbd_work *work, const char *name, umode_t mode)
 {
 	struct mnt_idmap *idmap;
 	struct path path;
-	struct dentry *dentry;
+	struct dentry *dentry, *d;
 	int err;
 
 	dentry = ksmbd_vfs_kern_path_create(work, name,
@@ -227,27 +227,11 @@ int ksmbd_vfs_mkdir(struct ksmbd_work *work, const char *name, umode_t mode)
 
 	idmap = mnt_idmap(path.mnt);
 	mode |= S_IFDIR;
-	err = vfs_mkdir(idmap, d_inode(path.dentry), dentry, mode);
-	if (!err && d_unhashed(dentry)) {
-		struct dentry *d;
-
-		d = lookup_one(idmap, dentry->d_name.name, dentry->d_parent,
-			       dentry->d_name.len);
-		if (IS_ERR(d)) {
-			err = PTR_ERR(d);
-			goto out_err;
-		}
-		if (unlikely(d_is_negative(d))) {
-			dput(d);
-			err = -ENOENT;
-			goto out_err;
-		}
-
+	d = dentry;
+	err = vfs_mkdir_return(idmap, d_inode(path.dentry), &dentry, mode);
+	if (!err && dentry != d)
 		ksmbd_vfs_inherit_owner(work, d_inode(path.dentry), d_inode(d));
-		dput(d);
-	}
 
-out_err:
 	done_path_create(&path, dentry);
 	if (err)
 		pr_err("mkdir(%s): creation failed (err:%d)\n", name, err);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index be3ad155ec9f..f81d6bc65fe4 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1971,6 +1971,8 @@ int vfs_create(struct mnt_idmap *, struct inode *,
 	       struct dentry *, umode_t, bool);
 int vfs_mkdir(struct mnt_idmap *, struct inode *,
 	      struct dentry *, umode_t);
+int vfs_mkdir_return(struct mnt_idmap *, struct inode *,
+		     struct dentry **, umode_t);
 int vfs_mknod(struct mnt_idmap *, struct inode *, struct dentry *,
 	      umode_t, dev_t);
 int vfs_symlink(struct mnt_idmap *, struct inode *,

From patchwork Thu Feb 6 05:42:39 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 13962212
From: NeilBrown
To: Alexander Viro, Christian Brauner, Jan Kara, Linus Torvalds,
 Jeff Layton, Dave Chinner
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 02/19] VFS: use global wait-queue table for d_alloc_parallel()
Date: Thu, 6 Feb 2025 16:42:39 +1100
Message-ID: <20250206054504.2950516-3-neilb@suse.de>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250206054504.2950516-1-neilb@suse.de>
References: <20250206054504.2950516-1-neilb@suse.de>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

d_alloc_parallel() currently requires a wait_queue_head to be passed in.
This must have a lifetime which extends until the lookup is completed.

Future proposed patches will use d_alloc_parallel() for names being
created/unlinked etc.  Some filesystems combine lookup with create,
making a longer code path that the wq needs to live for.  If it is still
to be allocated on-stack this can be cumbersome.

This patch replaces the on-stack wqs with a global array of wqs which
are used as needed.  A wq is NOT allocated when a dentry is first
created but only when a second thread attempts to use the same name and
so is forced to wait.  At this moment a wq is chosen using the
least-significant bits of the task's pid and that wq is assigned to
->d_wait.  The ->d_lock is then dropped and the task waits.

When the dentry is finally moved out of "in_lookup" a wake up is only
sent if ->d_wait is not NULL.  This avoids an (uncontended) spin
lock/unlock which saves a couple of atomic operations in a common case.

The wake up passes the dentry that the wake up is for as the "key", and
the waiter's wake function only wakes tasks waiting on the same key.
This means that when these global waitqueues are shared (which is
inevitable though unlikely to be frequent), a task will not be woken
prematurely.
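For illustration only (not part of the patch): a minimal userspace
sketch of the two ideas above, namely picking a shared queue from the
low bits of the pid and filtering wake-ups by key.  The toy_* names are
hypothetical, no real blocking or locking is modelled, and only the
table size (256) is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define PAR_LOOKUP_WQS 256	/* same table size as the patch */

/* Toy waiter: records which object (key) it is waiting for. */
struct toy_waiter {
	const void *key;
	int woken;
	struct toy_waiter *next;
};

/* Toy wait queue: a singly linked list of waiters. */
struct toy_wq {
	struct toy_waiter *head;
};

static struct toy_wq toy_wait_table[PAR_LOOKUP_WQS];

/* Pick a queue from the low bits of the pid (pid assumed non-negative). */
static struct toy_wq *toy_pick_wq(unsigned int pid)
{
	return &toy_wait_table[pid % PAR_LOOKUP_WQS];
}

static void toy_add_waiter(struct toy_wq *wq, struct toy_waiter *w,
			   const void *key)
{
	w->key = key;
	w->woken = 0;
	w->next = wq->head;
	wq->head = w;
}

/* Wake only waiters whose key matches; returns how many were woken. */
static int toy_wake(struct toy_wq *wq, const void *key)
{
	struct toy_waiter *w;
	int n = 0;

	for (w = wq->head; w; w = w->next) {
		if (w->key == key) {
			w->woken = 1;
			n++;
		}
	}
	return n;
}
```

With a table of 256 entries, pids 5 and 261 map to the same queue in
this sketch; a wake-up keyed on one object leaves the waiter for the
other object untouched, which is the property the commit message relies
on.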

Signed-off-by: NeilBrown
---
 fs/afs/dir_silly.c      |  4 +--
 fs/dcache.c             | 69 +++++++++++++++++++++++++++++++++--------
 fs/fuse/readdir.c       |  3 +-
 fs/namei.c              |  6 ++--
 fs/nfs/dir.c            |  6 ++--
 fs/nfs/unlink.c         |  3 +-
 fs/proc/base.c          |  3 +-
 fs/proc/proc_sysctl.c   |  3 +-
 fs/smb/client/readdir.c |  3 +-
 include/linux/dcache.h  |  3 +-
 include/linux/nfs_xdr.h |  1 -
 11 files changed, 67 insertions(+), 37 deletions(-)

diff --git a/fs/afs/dir_silly.c b/fs/afs/dir_silly.c
index a1e581946b93..aa4363a1c6fa 100644
--- a/fs/afs/dir_silly.c
+++ b/fs/afs/dir_silly.c
@@ -239,13 +239,11 @@ int afs_silly_iput(struct dentry *dentry, struct inode *inode)
 	struct dentry *alias;
 	int ret;
 
-	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
-
 	_enter("%p{%pd},%llx", dentry, dentry, vnode->fid.vnode);
 
 	down_read(&dvnode->rmdir_lock);
 
-	alias = d_alloc_parallel(dentry->d_parent, &dentry->d_name, &wq);
+	alias = d_alloc_parallel(dentry->d_parent, &dentry->d_name);
 	if (IS_ERR(alias)) {
 		up_read(&dvnode->rmdir_lock);
 		return 0;
diff --git a/fs/dcache.c b/fs/dcache.c
index 96b21a47312e..e49607d00d2d 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -2095,8 +2095,7 @@ struct dentry *d_add_ci(struct dentry *dentry, struct inode *inode,
 		return found;
 	}
 	if (d_in_lookup(dentry)) {
-		found = d_alloc_parallel(dentry->d_parent, name,
-					 dentry->d_wait);
+		found = d_alloc_parallel(dentry->d_parent, name);
 		if (IS_ERR(found) || !d_in_lookup(found)) {
 			iput(inode);
 			return found;
@@ -2106,7 +2105,7 @@ struct dentry *d_add_ci(struct dentry *dentry, struct inode *inode,
 		if (!found) {
 			iput(inode);
 			return ERR_PTR(-ENOMEM);
-		} 
+		}
 	}
 	res = d_splice_alias(inode, found);
 	if (res) {
@@ -2476,30 +2475,70 @@ static inline unsigned start_dir_add(struct inode *dir)
 }
 
 static inline void end_dir_add(struct inode *dir, unsigned int n,
-			       wait_queue_head_t *d_wait)
+			       wait_queue_head_t *d_wait, struct dentry *de)
 {
 	smp_store_release(&dir->i_dir_seq, n + 2);
 	preempt_enable_nested();
-	wake_up_all(d_wait);
+	if (d_wait)
+		__wake_up(d_wait, TASK_NORMAL, 0, de);
+}
+
+#define PAR_LOOKUP_WQS 256
+static wait_queue_head_t par_wait_table[PAR_LOOKUP_WQS] __cacheline_aligned;
+
+static int __init par_wait_init(void)
+{
+	int i;
+
+	for (i = 0; i < PAR_LOOKUP_WQS; i++)
+		init_waitqueue_head(&par_wait_table[i]);
+	return 0;
+}
+fs_initcall(par_wait_init);
+
+struct par_wait_key {
+	struct dentry *de;
+	struct wait_queue_entry wqe;
+};
+
+static int d_wait_wake_fn(struct wait_queue_entry *wq_entry,
+			  unsigned mode, int sync, void *key)
+{
+	struct par_wait_key *pwk = container_of(wq_entry,
+						 struct par_wait_key, wqe);
+	if (pwk->de == key)
+		return default_wake_function(wq_entry, mode, sync, key);
+	return 0;
 }
 
 static void d_wait_lookup(struct dentry *dentry)
 {
 	if (d_in_lookup(dentry)) {
-		DECLARE_WAITQUEUE(wait, current);
-		add_wait_queue(dentry->d_wait, &wait);
+		struct par_wait_key wk = {
+			.de = dentry,
+			.wqe = {
+				.private = current,
+				.func = d_wait_wake_fn,
+			},
+		};
+		struct wait_queue_head *wq;
+		if (!dentry->d_wait)
+			dentry->d_wait = &par_wait_table[current->pid %
+							 PAR_LOOKUP_WQS];
+		wq = dentry->d_wait;
+		add_wait_queue(wq, &wk.wqe);
 		do {
 			set_current_state(TASK_UNINTERRUPTIBLE);
 			spin_unlock(&dentry->d_lock);
 			schedule();
 			spin_lock(&dentry->d_lock);
 		} while (d_in_lookup(dentry));
+		remove_wait_queue(wq, &wk.wqe);
 	}
 }
 
 struct dentry *d_alloc_parallel(struct dentry *parent,
-				const struct qstr *name,
-				wait_queue_head_t *wq)
+				const struct qstr *name)
 {
 	unsigned int hash = name->hash;
 	struct hlist_bl_head *b = in_lookup_hash(parent, hash);
@@ -2596,7 +2635,7 @@ struct dentry *d_alloc_parallel(struct dentry *parent,
 	rcu_read_unlock();
 	/* we can't take ->d_lock here; it's OK, though. */
 	new->d_flags |= DCACHE_PAR_LOOKUP;
-	new->d_wait = wq;
+	new->d_wait = NULL;
 	hlist_bl_add_head(&new->d_u.d_in_lookup_hash, b);
 	hlist_bl_unlock(b);
 	return new;
@@ -2633,8 +2672,12 @@ static wait_queue_head_t *__d_lookup_unhash(struct dentry *dentry)
 
 void __d_lookup_unhash_wake(struct dentry *dentry)
 {
+	wait_queue_head_t *d_wait;
+
 	spin_lock(&dentry->d_lock);
-	wake_up_all(__d_lookup_unhash(dentry));
+	d_wait = __d_lookup_unhash(dentry);
+	if (d_wait)
+		__wake_up(d_wait, TASK_NORMAL, 0, dentry);
 	spin_unlock(&dentry->d_lock);
 }
 EXPORT_SYMBOL(__d_lookup_unhash_wake);
@@ -2662,7 +2705,7 @@ static inline void __d_add(struct dentry *dentry, struct inode *inode)
 	}
 	__d_rehash(dentry);
 	if (dir)
-		end_dir_add(dir, n, d_wait);
+		end_dir_add(dir, n, d_wait, dentry);
 	spin_unlock(&dentry->d_lock);
 	if (inode)
 		spin_unlock(&inode->i_lock);
@@ -2874,7 +2917,7 @@ static void __d_move(struct dentry *dentry, struct dentry *target,
 	write_seqcount_end(&dentry->d_seq);
 
 	if (dir)
-		end_dir_add(dir, n, d_wait);
+		end_dir_add(dir, n, d_wait, target);
 
 	if (dentry->d_parent != old_parent)
 		spin_unlock(&dentry->d_parent->d_lock);
diff --git a/fs/fuse/readdir.c b/fs/fuse/readdir.c
index 17ce9636a2b1..c6b646a3f1bd 100644
--- a/fs/fuse/readdir.c
+++ b/fs/fuse/readdir.c
@@ -160,7 +160,6 @@ static int fuse_direntplus_link(struct file *file,
 	struct inode *dir = d_inode(parent);
 	struct fuse_conn *fc;
 	struct inode *inode;
-	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
 
 	if (!o->nodeid) {
 		/*
@@ -195,7 +194,7 @@ static int fuse_direntplus_link(struct file *file,
 	dentry = d_lookup(parent, &name);
 	if (!dentry) {
 retry:
-		dentry = d_alloc_parallel(parent, &name, &wq);
+		dentry = d_alloc_parallel(parent, &name);
 		if (IS_ERR(dentry))
 			return PTR_ERR(dentry);
 	}
diff --git a/fs/namei.c b/fs/namei.c
index d98caf36e867..5cdbd2eb4056 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1769,13 +1769,12 @@ static struct dentry *__lookup_slow(const struct qstr *name,
 {
 	struct dentry *dentry, *old;
 	struct inode *inode = dir->d_inode;
-	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
 
 	/* Don't go there if it's already dead */
 	if (unlikely(IS_DEADDIR(inode)))
 		return ERR_PTR(-ENOENT);
 again:
-	dentry = d_alloc_parallel(dir, name, &wq);
+	dentry = d_alloc_parallel(dir, name);
 	if (IS_ERR(dentry))
 		return dentry;
 	if (unlikely(!d_in_lookup(dentry))) {
@@ -3561,7 +3560,6 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file,
 	struct dentry *dentry;
 	int error, create_error = 0;
 	umode_t mode = op->mode;
-	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
 
 	if (unlikely(IS_DEADDIR(dir_inode)))
 		return ERR_PTR(-ENOENT);
@@ -3570,7 +3568,7 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file,
 	dentry = d_lookup(dir, &nd->last);
 	for (;;) {
 		if (!dentry) {
-			dentry = d_alloc_parallel(dir, &nd->last, &wq);
+			dentry = d_alloc_parallel(dir, &nd->last);
 			if (IS_ERR(dentry))
 				return dentry;
 		}
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 2b04038b0e40..27c7a5c4e91b 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -725,7 +725,6 @@ void nfs_prime_dcache(struct dentry *parent, struct nfs_entry *entry,
 		unsigned long dir_verifier)
 {
 	struct qstr filename = QSTR_INIT(entry->name, entry->len);
-	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
 	struct dentry *dentry;
 	struct dentry *alias;
 	struct inode *inode;
@@ -754,7 +753,7 @@ void nfs_prime_dcache(struct dentry *parent, struct nfs_entry *entry,
 	dentry = d_lookup(parent, &filename);
 again:
 	if (!dentry) {
-		dentry = d_alloc_parallel(parent, &filename, &wq);
+		dentry = d_alloc_parallel(parent, &filename);
 		if (IS_ERR(dentry))
 			return;
 	}
@@ -2059,7 +2058,6 @@ int nfs_atomic_open(struct inode *dir, struct dentry *dentry,
 		    struct file *file, unsigned open_flags,
 		    umode_t mode)
 {
-	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
 	struct nfs_open_context *ctx;
 	struct dentry *res;
 	struct iattr attr = { .ia_valid = ATTR_OPEN };
@@ -2115,7 +2113,7 @@ int nfs_atomic_open(struct inode *dir, struct dentry *dentry,
 		d_drop(dentry);
 		switched = true;
 		dentry = d_alloc_parallel(dentry->d_parent,
-					  &dentry->d_name, &wq);
+					  &dentry->d_name);
 		if (IS_ERR(dentry))
 			return PTR_ERR(dentry);
 		if (unlikely(!d_in_lookup(dentry)))
diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c
index bf77399696a7..d44162d3a8f1 100644
--- a/fs/nfs/unlink.c
+++ b/fs/nfs/unlink.c
@@ -124,7 +124,7 @@ static int nfs_call_unlink(struct dentry *dentry, struct inode *inode, struct nf
 	struct dentry *alias;
 
 	down_read_non_owner(&NFS_I(dir)->rmdir_sem);
-	alias = d_alloc_parallel(dentry->d_parent, &data->args.name, &data->wq);
+	alias = d_alloc_parallel(dentry->d_parent, &data->args.name);
 	if (IS_ERR(alias)) {
 		up_read_non_owner(&NFS_I(dir)->rmdir_sem);
 		return 0;
@@ -185,7 +185,6 @@ nfs_async_unlink(struct dentry *dentry, const struct qstr *name)
 
 	data->cred = get_current_cred();
 	data->res.dir_attr = &data->dir_attr;
-	init_waitqueue_head(&data->wq);
 
 	status = -EBUSY;
 	spin_lock(&dentry->d_lock);
diff --git a/fs/proc/base.c b/fs/proc/base.c
index cd89e956c322..c8bcbdac87d5 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2126,8 +2126,7 @@ bool proc_fill_cache(struct file *file, struct dir_context *ctx,
 
 	child = d_hash_and_lookup(dir, &qname);
 	if (!child) {
-		DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
-		child = d_alloc_parallel(dir, &qname, &wq);
+		child = d_alloc_parallel(dir, &qname);
 		if (IS_ERR(child))
 			goto end_instantiate;
 		if (d_in_lookup(child)) {
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index cc9d74a06ff0..9f1088f138f4 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -693,8 +693,7 @@ static bool proc_sys_fill_cache(struct file *file,
 
 	child = d_lookup(dir, &qname);
 	if (!child) {
-		DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
-		child = d_alloc_parallel(dir, &qname, &wq);
+		child = d_alloc_parallel(dir, &qname);
 		if (IS_ERR(child))
 			return false;
 		if (d_in_lookup(child)) {
diff --git a/fs/smb/client/readdir.c b/fs/smb/client/readdir.c
index 50f96259d9ad..39d8a18cd443 100644
--- a/fs/smb/client/readdir.c
+++ b/fs/smb/client/readdir.c
@@ -73,7 +73,6 @@ cifs_prime_dcache(struct dentry *parent, struct qstr *name,
 	struct cifs_sb_info *cifs_sb = CIFS_SB(sb);
 	bool posix = cifs_sb_master_tcon(cifs_sb)->posix_extensions;
 	bool reparse_need_reval = false;
-	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
 	int rc;
 
 	cifs_dbg(FYI, "%s: for %s\n", __func__, name->name);
@@ -105,7 +104,7 @@ cifs_prime_dcache(struct dentry *parent, struct qstr *name,
 		    (fattr->cf_flags & CIFS_FATTR_NEED_REVAL))
 			return;
 
-		dentry = d_alloc_parallel(parent, name, &wq);
+		dentry = d_alloc_parallel(parent, name);
 	}
 	if (IS_ERR(dentry))
 		return;
diff --git a/include/linux/dcache.h b/include/linux/dcache.h
index 4afb60365675..b03cbb0177a3 100644
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -247,8 +247,7 @@ extern void d_set_d_op(struct dentry *dentry, const struct dentry_operations *op
 /* allocate/de-allocate */
 extern struct dentry * d_alloc(struct dentry *, const struct qstr *);
 extern struct dentry * d_alloc_anon(struct super_block *);
-extern struct dentry * d_alloc_parallel(struct dentry *, const struct qstr *,
-					wait_queue_head_t *);
+extern struct dentry * d_alloc_parallel(struct dentry *, const struct qstr *);
 extern struct dentry * d_splice_alias(struct inode *, struct dentry *);
 extern struct dentry * d_add_ci(struct dentry *, struct inode *, struct qstr *);
 extern bool d_same_name(const struct dentry *dentry, const struct dentry *parent,
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index 9155a6ffc370..d0473e0d4aba 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -1731,7 +1731,6 @@ struct nfs_unlinkdata {
 	struct nfs_removeargs args;
 	struct nfs_removeres res;
 	struct dentry *dentry;
-	wait_queue_head_t wq;
 	const struct cred *cred;
 	struct nfs_fattr dir_attr;
 	long timeout;

From patchwork Thu Feb 6 05:42:40 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 13962213
Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using
TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 01911223326; Thu, 6 Feb 2025 05:45:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738820756; cv=none; b=mAT0M0YWHC8cItP18dhGSzCOxuObVG/AE7c2zNt76A6afN4pk1P4K3DqeU37Pp5FXRI+raFfV5XbBFMLoVGMXEAjGEmAFtfP6JwORkjVSIPBRVm7wVzzryCWsWU8Pg7Y1IwI+Bx3JVFEN8EO2shczliP+6X/5IKed0qIGdbEwbE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738820756; c=relaxed/simple; bh=GQsP3oUsMZBcSdxCqZVtkC71SFLhhLzFmy3z7fFKcg0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=W4uxcbmZhTDRq9Qh7u/uAZm7HgBjkGJY9CVjX/ROw9+QH12b984M6C+frlNUYgr+CRORsc1UId1yWOENjy8qXbqsi6vUEir6iAYOOqWboLaV2JbgmcrM0SvuSKq35gPBG++fxAdDzf31txX+qOGwqC1LcR+91S8oOUSQjPGJTrA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de; spf=pass smtp.mailfrom=suse.de; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=PElTk/1U; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=jDPXUkMD; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=PElTk/1U; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=jDPXUkMD; arc=none smtp.client-ip=195.135.223.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="PElTk/1U"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="jDPXUkMD"; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="PElTk/1U"; dkim=permerror (0-bit key) header.d=suse.de 
header.i=@suse.de header.b="jDPXUkMD" Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 01CE11F381; Thu, 6 Feb 2025 05:45:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1738820753; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=DpRrOyHext/vKwSVHPvryUafdoA8729VpIBcf15u5hs=; b=PElTk/1UtEhyEl4c3urC3M3tsst2UfqiVeyxF8WArIIJthbzDknTwK9dUBJtIn9+xvFoEQ UzvLorrt52p3mENGcc14wceWCCNHxbtBPNXT6/4xB5+23gv/Thqtlttv2+JRZRwnf2oPu/ TRfM08pIzBpHCchHP6AtkLZTvAiAU3s= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1738820753; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=DpRrOyHext/vKwSVHPvryUafdoA8729VpIBcf15u5hs=; b=jDPXUkMD3jqZBXgT93yBd/R5KuvaLbCi3tjG4YR/LxsOJI7Qgx+vOKGJuH0rQRXw1R3SX4 SrieCb4N8YPmcQAw== Authentication-Results: smtp-out2.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1738820753; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=DpRrOyHext/vKwSVHPvryUafdoA8729VpIBcf15u5hs=; b=PElTk/1UtEhyEl4c3urC3M3tsst2UfqiVeyxF8WArIIJthbzDknTwK9dUBJtIn9+xvFoEQ UzvLorrt52p3mENGcc14wceWCCNHxbtBPNXT6/4xB5+23gv/Thqtlttv2+JRZRwnf2oPu/ TRfM08pIzBpHCchHP6AtkLZTvAiAU3s= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1738820753; 
h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=DpRrOyHext/vKwSVHPvryUafdoA8729VpIBcf15u5hs=; b=jDPXUkMD3jqZBXgT93yBd/R5KuvaLbCi3tjG4YR/LxsOJI7Qgx+vOKGJuH0rQRXw1R3SX4 SrieCb4N8YPmcQAw== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 3FFC613795; Thu, 6 Feb 2025 05:45:49 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id 39U5OY1MpGdeBwAAD6G6ig (envelope-from ); Thu, 06 Feb 2025 05:45:49 +0000 From: NeilBrown To: Alexander Viro , Christian Brauner , Jan Kara , Linus Torvalds , Jeff Layton , Dave Chinner Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 03/19] VFS: use d_alloc_parallel() in lookup_one_qstr_excl() and rename it. 
Date: Thu, 6 Feb 2025 16:42:40 +1100 Message-ID: <20250206054504.2950516-4-neilb@suse.de> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250206054504.2950516-1-neilb@suse.de> References: <20250206054504.2950516-1-neilb@suse.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Level: X-Spamd-Result: default: False [-2.80 / 50.00]; BAYES_HAM(-3.00)[100.00%]; MID_CONTAINS_FROM(1.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_MISSING_CHARSET(0.50)[]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; RCVD_VIA_SMTP_AUTH(0.00)[]; RCPT_COUNT_SEVEN(0.00)[8]; TO_DN_SOME(0.00)[]; MIME_TRACE(0.00)[0:+]; ARC_NA(0.00)[]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; FUZZY_BLOCKED(0.00)[rspamd.com]; FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[]; R_RATELIMIT(0.00)[from(RLewrxuus8mos16izbn)]; RCVD_COUNT_TWO(0.00)[2]; TO_MATCH_ENVRCPT_ALL(0.00)[]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:email,suse.de:mid]; RCVD_TLS_ALL(0.00)[] X-Spam-Score: -2.80 X-Spam-Flag: NO lookup_one_qstr_excl() is used for lookups prior to directory modifications, whether create, unlink, rename, or whatever. To prepare for allowing modifications to happen in parallel, change lookup_one_qstr_excl() to use d_alloc_parallel(). To reflect this, the name is changed to lookup_one_qstr() - as the directory may be locked shared. If any of the "intent" LOOKUP flags are passed, the caller must ensure d_lookup_done() is called at an appropriate time. If none are passed then we can be sure ->lookup() will do a real lookup and d_lookup_done() is called internally.
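As a rough illustration of the d_lookup_done() rule described above, the following userspace sketch models which side is responsible for the call. It is not kernel code and not part of the patch: caller_must_call_d_lookup_done() is a hypothetical helper, and the values of LOOKUP_OPEN and LOOKUP_CREATE are assumed to match include/linux/namei.h (LOOKUP_EXCL and LOOKUP_RENAME_TARGET appear in the diff for this patch).

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for the kernel flag values. LOOKUP_OPEN and
 * LOOKUP_CREATE are assumed from include/linux/namei.h; the other two
 * values are visible in the namei.h hunk of this patch. */
#define LOOKUP_OPEN          0x0100
#define LOOKUP_CREATE        0x0200
#define LOOKUP_EXCL          0x0400
#define LOOKUP_RENAME_TARGET 0x0800

/* Same composition as the LOOKUP_INTENT_FLAGS macro added by the patch. */
#define LOOKUP_INTENT_FLAGS  (LOOKUP_OPEN | LOOKUP_CREATE | LOOKUP_EXCL | \
                              LOOKUP_RENAME_TARGET)

/*
 * Hypothetical helper (illustration only): true when the caller of
 * lookup_one_qstr() is expected to call d_lookup_done() itself, i.e.
 * when at least one intent flag was passed. With no intent flags the
 * lookup is final and d_lookup_done() is handled internally.
 */
static bool caller_must_call_d_lookup_done(unsigned int flags)
{
    return (flags & LOOKUP_INTENT_FLAGS) != 0;
}
```

For example, a plain lookup with flags == 0 (as in __kern_path_locked()) leaves nothing for the caller to do, while a rename target lookup does.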
Signed-off-by: NeilBrown --- fs/namei.c | 47 +++++++++++++++++++++++++------------------ fs/smb/server/vfs.c | 7 ++++--- include/linux/namei.h | 9 ++++++--- 3 files changed, 37 insertions(+), 26 deletions(-) diff --git a/fs/namei.c b/fs/namei.c index 5cdbd2eb4056..d684102d873d 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -1665,15 +1665,13 @@ static struct dentry *lookup_dcache(const struct qstr *name, } /* - * Parent directory has inode locked exclusive. This is one - * and only case when ->lookup() gets called on non in-lookup - * dentries - as the matter of fact, this only gets called - * when directory is guaranteed to have no in-lookup children - * at all. + * Parent directory has inode locked: exclusive or shared. + * If @flags contains any LOOKUP_INTENT_FLAGS then d_lookup_done() + * must be called after the intended operation is performed - or aborted. */ -struct dentry *lookup_one_qstr_excl(const struct qstr *name, - struct dentry *base, - unsigned int flags) +struct dentry *lookup_one_qstr(const struct qstr *name, + struct dentry *base, + unsigned int flags) { struct dentry *dentry = lookup_dcache(name, base, flags); struct dentry *old; @@ -1686,18 +1684,25 @@ struct dentry *lookup_one_qstr_excl(const struct qstr *name, if (unlikely(IS_DEADDIR(dir))) return ERR_PTR(-ENOENT); - dentry = d_alloc(base, name); - if (unlikely(!dentry)) + dentry = d_alloc_parallel(base, name); + if (unlikely(IS_ERR_OR_NULL(dentry))) return ERR_PTR(-ENOMEM); + if (!d_in_lookup(dentry)) + /* Raced with another thread which did the lookup */ + return dentry; old = dir->i_op->lookup(dir, dentry, flags); if (unlikely(old)) { + d_lookup_done(dentry); dput(dentry); dentry = old; } + if ((flags & LOOKUP_INTENT_FLAGS) == 0) + /* ->lookup must have given final answer */ + d_lookup_done(dentry); return dentry; } -EXPORT_SYMBOL(lookup_one_qstr_excl); +EXPORT_SYMBOL(lookup_one_qstr); /** * lookup_fast - do fast lockless (but racy) lookup of a dentry @@ -2739,7 +2744,7 @@ static struct dentry 
*__kern_path_locked(int dfd, struct filename *name, struct return ERR_PTR(-EINVAL); } inode_lock_nested(path->dentry->d_inode, I_MUTEX_PARENT); - d = lookup_one_qstr_excl(&last, path->dentry, 0); + d = lookup_one_qstr(&last, path->dentry, 0); if (IS_ERR(d)) { inode_unlock(path->dentry->d_inode); path_put(path); @@ -4078,8 +4083,8 @@ static struct dentry *filename_create(int dfd, struct filename *name, if (last.name[last.len] && !want_dir) create_flags = 0; inode_lock_nested(path->dentry->d_inode, I_MUTEX_PARENT); - dentry = lookup_one_qstr_excl(&last, path->dentry, - reval_flag | create_flags); + dentry = lookup_one_qstr(&last, path->dentry, + reval_flag | create_flags); if (IS_ERR(dentry)) goto unlock; @@ -4103,6 +4108,7 @@ static struct dentry *filename_create(int dfd, struct filename *name, } return dentry; fail: + d_lookup_done(dentry); dput(dentry); dentry = ERR_PTR(error); unlock: @@ -4508,7 +4514,7 @@ int do_rmdir(int dfd, struct filename *name) goto exit2; inode_lock_nested(path.dentry->d_inode, I_MUTEX_PARENT); - dentry = lookup_one_qstr_excl(&last, path.dentry, lookup_flags); + dentry = lookup_one_qstr(&last, path.dentry, lookup_flags); error = PTR_ERR(dentry); if (IS_ERR(dentry)) goto exit3; @@ -4641,7 +4647,7 @@ int do_unlinkat(int dfd, struct filename *name) goto exit2; retry_deleg: inode_lock_nested(path.dentry->d_inode, I_MUTEX_PARENT); - dentry = lookup_one_qstr_excl(&last, path.dentry, lookup_flags); + dentry = lookup_one_qstr(&last, path.dentry, lookup_flags); error = PTR_ERR(dentry); if (!IS_ERR(dentry)) { @@ -5231,8 +5237,8 @@ int do_renameat2(int olddfd, struct filename *from, int newdfd, goto exit_lock_rename; } - old_dentry = lookup_one_qstr_excl(&old_last, old_path.dentry, - lookup_flags); + old_dentry = lookup_one_qstr(&old_last, old_path.dentry, + lookup_flags); error = PTR_ERR(old_dentry); if (IS_ERR(old_dentry)) goto exit3; @@ -5240,8 +5246,8 @@ int do_renameat2(int olddfd, struct filename *from, int newdfd, error = -ENOENT; if 
(d_is_negative(old_dentry)) goto exit4; - new_dentry = lookup_one_qstr_excl(&new_last, new_path.dentry, - lookup_flags | target_flags); + new_dentry = lookup_one_qstr(&new_last, new_path.dentry, + lookup_flags | target_flags); error = PTR_ERR(new_dentry); if (IS_ERR(new_dentry)) goto exit4; @@ -5292,6 +5298,7 @@ int do_renameat2(int olddfd, struct filename *from, int newdfd, rd.flags = flags; error = vfs_rename(&rd); exit5: + d_lookup_done(new_dentry); dput(new_dentry); exit4: dput(old_dentry); diff --git a/fs/smb/server/vfs.c b/fs/smb/server/vfs.c index 4e580bb7baf8..89b3823f6405 100644 --- a/fs/smb/server/vfs.c +++ b/fs/smb/server/vfs.c @@ -109,7 +109,7 @@ static int ksmbd_vfs_path_lookup_locked(struct ksmbd_share_config *share_conf, } inode_lock_nested(parent_path->dentry->d_inode, I_MUTEX_PARENT); - d = lookup_one_qstr_excl(&last, parent_path->dentry, 0); + d = lookup_one_qstr(&last, parent_path->dentry, 0); if (IS_ERR(d)) goto err_out; @@ -726,8 +726,8 @@ int ksmbd_vfs_rename(struct ksmbd_work *work, const struct path *old_path, ksmbd_fd_put(work, parent_fp); } - new_dentry = lookup_one_qstr_excl(&new_last, new_path.dentry, - lookup_flags | LOOKUP_RENAME_TARGET); + new_dentry = lookup_one_qstr(&new_last, new_path.dentry, + lookup_flags | LOOKUP_RENAME_TARGET); if (IS_ERR(new_dentry)) { err = PTR_ERR(new_dentry); goto out3; @@ -771,6 +771,7 @@ int ksmbd_vfs_rename(struct ksmbd_work *work, const struct path *old_path, ksmbd_debug(VFS, "vfs_rename failed err %d\n", err); out4: + d_lookup_done(new_dentry); dput(new_dentry); out3: dput(old_parent); diff --git a/include/linux/namei.h b/include/linux/namei.h index 8ec8fed3bce8..06bb3ea65beb 100644 --- a/include/linux/namei.h +++ b/include/linux/namei.h @@ -34,6 +34,9 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT}; #define LOOKUP_EXCL 0x0400 /* ... in exclusive creation */ #define LOOKUP_RENAME_TARGET 0x0800 /* ... 
in destination of rename() */ +#define LOOKUP_INTENT_FLAGS (LOOKUP_OPEN | LOOKUP_CREATE | LOOKUP_EXCL | \ + LOOKUP_RENAME_TARGET) + /* internal use only */ #define LOOKUP_PARENT 0x0010 @@ -52,9 +55,9 @@ extern int path_pts(struct path *path); extern int user_path_at(int, const char __user *, unsigned, struct path *); -struct dentry *lookup_one_qstr_excl(const struct qstr *name, - struct dentry *base, - unsigned int flags); +struct dentry *lookup_one_qstr(const struct qstr *name, + struct dentry *base, + unsigned int flags); extern int kern_path(const char *, unsigned, struct path *); extern struct dentry *kern_path_create(int, const char *, struct path *, unsigned int); From patchwork Thu Feb 6 05:42:41 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 13962214 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 094B91B59A; Thu, 6 Feb 2025 05:46:08 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738820770; cv=none; b=td7N1+nPFYibLgA+pJbjkVB+aGex8wBZ11Un++PpgBOjg7IJgQCsV5LaW8LAWMsC3dhlDkNCMvTs1qvLy+3vH1FfSNcrHqET1LIdj8xiGHwJVcwgMeaJGNnw5dZGUxusN/yv2kr0N6EZdEtkjmxAl6ChaA8OYojx4UCL0vbplU4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738820770; c=relaxed/simple; bh=IFL6e9skluEBA+ZHKwDT0D/ofuadconVJvzIi3nlTQg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=OwdHWLnK548egDtqqyhfPaU0+s7ntPTZGtLFK2NvYTvG6uc1rhA7QDS5kcU6v6rXdYCUblY8Y84QM4ciTPNDJ+ytkAbK/mKnO8yiV/HCL2hdMVQR1s7kbBIZCVKn8FZBt0hTxII3DmBVZJ5Kvg9FzkPLITZx2FWCM4iBfPKTN3Q= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none 
dis=none) header.from=suse.de; spf=pass smtp.mailfrom=suse.de; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=o1yycAys; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=5ZZ5GMuX; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=o1yycAys; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=5ZZ5GMuX; arc=none smtp.client-ip=195.135.223.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="o1yycAys"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="5ZZ5GMuX"; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="o1yycAys"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="5ZZ5GMuX" Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 3BDD81F381; Thu, 6 Feb 2025 05:46:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1738820767; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=4UdCFI90AtbHekiiPg50qrTWMAXWD+8+25n2lXUv+JU=; b=o1yycAysmdRk1yYMgtMnodZmvLb+bz0seyWMMYCdyNAtyamJxYSnE7q8nSQRvjotLrNkpm 0Jl7t8wM/AqHlHx+vij1u8XEmM8IBjzwRgr2FLOgFMDLE5suEhcna7A8z5E0afMNQ1aNWv 4mNBe0AOf/MqtkNeikFRWWBoAGG3CDA= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1738820767; 
h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=4UdCFI90AtbHekiiPg50qrTWMAXWD+8+25n2lXUv+JU=; b=5ZZ5GMuXeGmLdeTzwcHMfltWc1nqxOjIkqnApXi+PI3XAxyeJRNmnlbCR7IdC7VqPwwSfX FmIZSMLoLETIpeCw== Authentication-Results: smtp-out2.suse.de; dkim=pass header.d=suse.de header.s=susede2_rsa header.b=o1yycAys; dkim=pass header.d=suse.de header.s=susede2_ed25519 header.b=5ZZ5GMuX DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1738820767; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=4UdCFI90AtbHekiiPg50qrTWMAXWD+8+25n2lXUv+JU=; b=o1yycAysmdRk1yYMgtMnodZmvLb+bz0seyWMMYCdyNAtyamJxYSnE7q8nSQRvjotLrNkpm 0Jl7t8wM/AqHlHx+vij1u8XEmM8IBjzwRgr2FLOgFMDLE5suEhcna7A8z5E0afMNQ1aNWv 4mNBe0AOf/MqtkNeikFRWWBoAGG3CDA= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1738820767; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=4UdCFI90AtbHekiiPg50qrTWMAXWD+8+25n2lXUv+JU=; b=5ZZ5GMuXeGmLdeTzwcHMfltWc1nqxOjIkqnApXi+PI3XAxyeJRNmnlbCR7IdC7VqPwwSfX FmIZSMLoLETIpeCw== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 7B1C113795; Thu, 6 Feb 2025 05:46:04 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id floTDJxMpGdyBwAAD6G6ig (envelope-from ); Thu, 06 Feb 2025 05:46:04 +0000 
From: NeilBrown To: Alexander Viro , Christian Brauner , Jan Kara , Linus Torvalds , Jeff Layton , Dave Chinner Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 04/19] VFS: change kern_path_locked() and user_path_locked_at() to never return negative dentry Date: Thu, 6 Feb 2025 16:42:41 +1100 Message-ID: <20250206054504.2950516-5-neilb@suse.de> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250206054504.2950516-1-neilb@suse.de> References: <20250206054504.2950516-1-neilb@suse.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Rspamd-Queue-Id: 3BDD81F381 X-Spam-Level: X-Spamd-Result: default: False [-3.01 / 50.00]; BAYES_HAM(-3.00)[100.00%]; MID_CONTAINS_FROM(1.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_MISSING_CHARSET(0.50)[]; NEURAL_HAM_SHORT(-0.20)[-1.000]; R_DKIM_ALLOW(-0.20)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; MIME_GOOD(-0.10)[text/plain]; MX_GOOD(-0.01)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; FUZZY_BLOCKED(0.00)[rspamd.com]; MIME_TRACE(0.00)[0:+]; ARC_NA(0.00)[]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; RCVD_TLS_ALL(0.00)[]; DKIM_TRACE(0.00)[suse.de:+]; FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; RCVD_VIA_SMTP_AUTH(0.00)[]; RCPT_COUNT_SEVEN(0.00)[8]; R_RATELIMIT(0.00)[from(RLewrxuus8mos16izbn)]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:email,suse.de:dkim,suse.de:mid] X-Rspamd-Server: rspamd2.dmz-prg2.suse.org X-Rspamd-Action: no action X-Spam-Score: -3.01 X-Spam-Flag: NO No callers of kern_path_locked() or user_path_locked_at() want a negative dentry. So change them to return -ENOENT instead. This simplifies callers. This results in a subtle change to bcachefs in that an ioctl will now return -ENOENT in preference to -EXDEV. 
I believe this restores the behaviour to what it was prior to Commit bbe6a7c899e7 ("bch2_ioctl_subvolume_destroy(): fix locking") Signed-off-by: NeilBrown --- drivers/base/devtmpfs.c | 65 +++++++++++++++++++---------------------- fs/bcachefs/fs-ioctl.c | 4 --- fs/namei.c | 4 +++ kernel/audit_watch.c | 12 ++++---- 4 files changed, 40 insertions(+), 45 deletions(-) diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c index b848764ef018..c9e34842139f 100644 --- a/drivers/base/devtmpfs.c +++ b/drivers/base/devtmpfs.c @@ -245,15 +245,12 @@ static int dev_rmdir(const char *name) dentry = kern_path_locked(name, &parent); if (IS_ERR(dentry)) return PTR_ERR(dentry); - if (d_really_is_positive(dentry)) { - if (d_inode(dentry)->i_private == &thread) - err = vfs_rmdir(&nop_mnt_idmap, d_inode(parent.dentry), - dentry); - else - err = -EPERM; - } else { - err = -ENOENT; - } + if (d_inode(dentry)->i_private == &thread) + err = vfs_rmdir(&nop_mnt_idmap, d_inode(parent.dentry), + dentry); + else + err = -EPERM; + dput(dentry); inode_unlock(d_inode(parent.dentry)); path_put(&parent); @@ -310,6 +307,8 @@ static int handle_remove(const char *nodename, struct device *dev) { struct path parent; struct dentry *dentry; + struct kstat stat; + struct path p; int deleted = 0; int err; @@ -317,32 +316,28 @@ static int handle_remove(const char *nodename, struct device *dev) if (IS_ERR(dentry)) return PTR_ERR(dentry); - if (d_really_is_positive(dentry)) { - struct kstat stat; - struct path p = {.mnt = parent.mnt, .dentry = dentry}; - err = vfs_getattr(&p, &stat, STATX_TYPE | STATX_MODE, - AT_STATX_SYNC_AS_STAT); - if (!err && dev_mynode(dev, d_inode(dentry), &stat)) { - struct iattr newattrs; - /* - * before unlinking this node, reset permissions - * of possible references like hardlinks - */ - newattrs.ia_uid = GLOBAL_ROOT_UID; - newattrs.ia_gid = GLOBAL_ROOT_GID; - newattrs.ia_mode = stat.mode & ~0777; - newattrs.ia_valid = - ATTR_UID|ATTR_GID|ATTR_MODE; - 
inode_lock(d_inode(dentry)); - notify_change(&nop_mnt_idmap, dentry, &newattrs, NULL); - inode_unlock(d_inode(dentry)); - err = vfs_unlink(&nop_mnt_idmap, d_inode(parent.dentry), - dentry, NULL); - if (!err || err == -ENOENT) - deleted = 1; - } - } else { - err = -ENOENT; + p.mnt = parent.mnt; + p.dentry = dentry; + err = vfs_getattr(&p, &stat, STATX_TYPE | STATX_MODE, + AT_STATX_SYNC_AS_STAT); + if (!err && dev_mynode(dev, d_inode(dentry), &stat)) { + struct iattr newattrs; + /* + * before unlinking this node, reset permissions + * of possible references like hardlinks + */ + newattrs.ia_uid = GLOBAL_ROOT_UID; + newattrs.ia_gid = GLOBAL_ROOT_GID; + newattrs.ia_mode = stat.mode & ~0777; + newattrs.ia_valid = + ATTR_UID|ATTR_GID|ATTR_MODE; + inode_lock(d_inode(dentry)); + notify_change(&nop_mnt_idmap, dentry, &newattrs, NULL); + inode_unlock(d_inode(dentry)); + err = vfs_unlink(&nop_mnt_idmap, d_inode(parent.dentry), + dentry, NULL); + if (!err || err == -ENOENT) + deleted = 1; } dput(dentry); inode_unlock(d_inode(parent.dentry)); diff --git a/fs/bcachefs/fs-ioctl.c b/fs/bcachefs/fs-ioctl.c index 15725b4ce393..595b57fabc9a 100644 --- a/fs/bcachefs/fs-ioctl.c +++ b/fs/bcachefs/fs-ioctl.c @@ -511,10 +511,6 @@ static long bch2_ioctl_subvolume_destroy(struct bch_fs *c, struct file *filp, ret = -EXDEV; goto err; } - if (!d_is_positive(victim)) { - ret = -ENOENT; - goto err; - } ret = __bch2_unlink(dir, victim, true); if (!ret) { fsnotify_rmdir(dir, victim); diff --git a/fs/namei.c b/fs/namei.c index d684102d873d..1901120bcbb8 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -2745,6 +2745,10 @@ static struct dentry *__kern_path_locked(int dfd, struct filename *name, struct } inode_lock_nested(path->dentry->d_inode, I_MUTEX_PARENT); d = lookup_one_qstr(&last, path->dentry, 0); + if (!IS_ERR(d) && d_is_negative(d)) { + dput(d); + d = ERR_PTR(-ENOENT); + } if (IS_ERR(d)) { inode_unlock(path->dentry->d_inode); path_put(path); diff --git a/kernel/audit_watch.c 
b/kernel/audit_watch.c index 7f358740e958..e3130675ee6b 100644 --- a/kernel/audit_watch.c +++ b/kernel/audit_watch.c @@ -350,11 +350,10 @@ static int audit_get_nd(struct audit_watch *watch, struct path *parent) struct dentry *d = kern_path_locked(watch->path, parent); if (IS_ERR(d)) return PTR_ERR(d); - if (d_is_positive(d)) { - /* update watch filter fields */ - watch->dev = d->d_sb->s_dev; - watch->ino = d_backing_inode(d)->i_ino; - } + /* update watch filter fields */ + watch->dev = d->d_sb->s_dev; + watch->ino = d_backing_inode(d)->i_ino; + inode_unlock(d_backing_inode(parent->dentry)); dput(d); return 0; @@ -419,7 +418,7 @@ int audit_add_watch(struct audit_krule *krule, struct list_head **list) /* caller expects mutex locked */ mutex_lock(&audit_filter_mutex); - if (ret) { + if (ret && ret != -ENOENT) { audit_put_watch(watch); return ret; } @@ -438,6 +437,7 @@ int audit_add_watch(struct audit_krule *krule, struct list_head **list) h = audit_hash_ino((u32)watch->ino); *list = &audit_inode_hash[h]; + ret = 0; error: path_put(&parent_path); audit_put_watch(watch); From patchwork Thu Feb 6 05:42:42 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 13962215 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.223.130]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3B1BA1B59A; Thu, 6 Feb 2025 05:46:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.130 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738820777; cv=none; b=FLEZmTdaDKvze3JX3B2WEyYngveaN5LzD86Is5Jgli/Fr/k5jrptvQKpG37iYZtxFxk1rQ8EPiRu+I+ZwNZbo7V2DZy3LyKC128x7guOrFCr7w66KuAPoRYzS/U/ZJcdOiMfIzkmHNPnILmuFr0gORMfpOQA4FJXNtMoLzgHbnM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; 
t=1738820777; c=relaxed/simple; bh=JkJvFV88L+Oz9pqbDobXWftglWw2WTtyc8jXH80FwFU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=kj/xmecyCvPz2Y2ZmDjsAc3ByfdqfsB8oTR5imPol2tNkP+NYAkxTG/ov1o69PHvg8uwCtn7ZUXZl9GoVMpXgNUX+0X8c4kkhrwyRkYguU1LUCOBvHBCJV8MWCiLW3N88CJ2c4d5mbi7aDS++JKaX6dv50geAzm8g5PISedzHc8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de; spf=pass smtp.mailfrom=suse.de; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=e+wtYod6; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=cAJbZx+Q; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=e+wtYod6; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=cAJbZx+Q; arc=none smtp.client-ip=195.135.223.130 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="e+wtYod6"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="cAJbZx+Q"; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="e+wtYod6"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="cAJbZx+Q" Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id C119821108; Thu, 6 Feb 2025 05:46:13 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1738820773; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: 
in-reply-to:in-reply-to:references:references; bh=l36d111ZVdx2z5cPfSMmMWyzXjCSVO4e/vMuSUOklnE=; b=e+wtYod6c57vEBKeMZCiuh0hzJDZWynzpVjEwefJgzEun3lXN1+QaOWw7FKbVnLa3W3qE7 G2dSB6ioWv9bSar8jzG9yfi+oFrdAnLoUx1Xr0/905CTFI5lu6XBjdXq+CgwGwMmBk+/KF xWJcvG7Mx4IUTysO2wVqtZVQdOOmqKk= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1738820773; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=l36d111ZVdx2z5cPfSMmMWyzXjCSVO4e/vMuSUOklnE=; b=cAJbZx+QAPyF7cbyr5DlbKc+hDFl4PU9P00icHSm5L2eBH4VbP1g4oaMyDTI/jTAWMpr69 ScC1gVyFVbkWliDQ== Authentication-Results: smtp-out1.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1738820773; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=l36d111ZVdx2z5cPfSMmMWyzXjCSVO4e/vMuSUOklnE=; b=e+wtYod6c57vEBKeMZCiuh0hzJDZWynzpVjEwefJgzEun3lXN1+QaOWw7FKbVnLa3W3qE7 G2dSB6ioWv9bSar8jzG9yfi+oFrdAnLoUx1Xr0/905CTFI5lu6XBjdXq+CgwGwMmBk+/KF xWJcvG7Mx4IUTysO2wVqtZVQdOOmqKk= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1738820773; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=l36d111ZVdx2z5cPfSMmMWyzXjCSVO4e/vMuSUOklnE=; b=cAJbZx+QAPyF7cbyr5DlbKc+hDFl4PU9P00icHSm5L2eBH4VbP1g4oaMyDTI/jTAWMpr69 ScC1gVyFVbkWliDQ== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 
B5C3213795; Thu, 6 Feb 2025 05:46:10 +0000 (UTC)
Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id oD5sGqJMpGd5BwAAD6G6ig (envelope-from ); Thu, 06 Feb 2025 05:46:10 +0000
From: NeilBrown
To: Alexander Viro, Christian Brauner, Jan Kara, Linus Torvalds, Jeff Layton, Dave Chinner
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 05/19] VFS: add common error checks to lookup_one_qstr()
Date: Thu, 6 Feb 2025 16:42:42 +1100
Message-ID: <20250206054504.2950516-6-neilb@suse.de>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250206054504.2950516-1-neilb@suse.de>
References: <20250206054504.2950516-1-neilb@suse.de>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0

Callers of lookup_one_qstr() often check if the result is negative or
positive.  These checks can easily be moved into lookup_one_qstr() by
checking the lookup flags:
 LOOKUP_CREATE means it is NOT an error if the name doesn't exist.
 LOOKUP_EXCL means it IS an error if the name DOES exist.

This patch adds these checks, then removes error checks from callers,
and ensures that appropriate flags are passed.
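As a rough user-space illustration of the two rules above (not kernel code: `struct fake_dentry` and `check_lookup_result()` are hypothetical stand-ins, and the flag values are the ones include/linux/namei.h carries at this point in the series), the checks reduce to something like:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Flag values as defined in include/linux/namei.h before the later repack. */
#define LOOKUP_CREATE 0x0200
#define LOOKUP_EXCL   0x0400

static_assert((LOOKUP_CREATE & LOOKUP_EXCL) == 0, "flags must be distinct bits");

/* Hypothetical stand-in for a dentry: only records whether an inode is attached. */
struct fake_dentry {
	bool has_inode;	/* true = positive dentry, false = negative dentry */
};

/*
 * Model of the post-lookup checks described in the changelog.
 * Returns 0 if the dentry would be handed back to the caller,
 * or a negative errno if the lookup would fail instead.
 */
int check_lookup_result(const struct fake_dentry *d, unsigned int flags)
{
	if (!d->has_inode && !(flags & LOOKUP_CREATE))
		return -ENOENT;	/* name is missing and the caller is not creating */
	if (d->has_inode && (flags & LOOKUP_EXCL))
		return -EEXIST;	/* name is present and the caller requires absence */
	return 0;
}
```

Under this model a caller that passes no intent flags gets -ENOENT for a missing name, and a caller that passes LOOKUP_CREATE | LOOKUP_EXCL gets -EEXIST for an existing one, which is what lets the per-caller checks be dropped.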
This subtly changes the meaning of LOOKUP_EXCL.  Previously it could
only accompany LOOKUP_CREATE.  Now it can accompany LOOKUP_RENAME_TARGET
as well.  A couple of small changes are needed to accommodate this.  The
NFS change is functionally a no-op but ensures nfs_is_exclusive_create()
does exactly what the name says.

Signed-off-by: NeilBrown
---
 fs/namei.c            | 61 ++++++++++++++-----------------------------
 fs/nfs/dir.c          |  3 ++-
 fs/smb/server/vfs.c   | 26 +++++++-----------
 include/linux/namei.h |  2 +-
 4 files changed, 33 insertions(+), 59 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 1901120bcbb8..69610047f6c6 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1668,6 +1668,8 @@ static struct dentry *lookup_dcache(const struct qstr *name,
  * Parent directory has inode locked: exclusive or shared.
  * If @flags contains any LOOKUP_INTENT_FLAGS then d_lookup_done()
  * must be called after the intended operation is performed - or aborted.
+ * Will return -ENOENT if name isn't found and LOOKUP_CREATE wasn't passed.
+ * Will return -EEXIST if name is found and LOOKUP_EXCL was passed.
  */
 struct dentry *lookup_one_qstr(const struct qstr *name,
 			       struct dentry *base,
@@ -1678,7 +1680,7 @@ struct dentry *lookup_one_qstr(const struct qstr *name,
 	struct inode *dir = base->d_inode;
 
 	if (dentry)
-		return dentry;
+		goto found;
 
 	/* Don't create child dentry for a dead directory. */
 	if (unlikely(IS_DEADDIR(dir)))
@@ -1689,7 +1691,7 @@ struct dentry *lookup_one_qstr(const struct qstr *name,
 		return ERR_PTR(-ENOMEM);
 	if (!d_in_lookup(dentry))
 		/* Raced with another thread which did the lookup */
-		return dentry;
+		goto found;
 
 	old = dir->i_op->lookup(dir, dentry, flags);
 	if (unlikely(old)) {
@@ -1700,6 +1702,15 @@ struct dentry *lookup_one_qstr(const struct qstr *name,
 	if ((flags & LOOKUP_INTENT_FLAGS) == 0)
 		/* ->lookup must have given final answer */
 		d_lookup_done(dentry);
+found:
+	if (d_is_negative(dentry) && !(flags & LOOKUP_CREATE)) {
+		dput(dentry);
+		return ERR_PTR(-ENOENT);
+	}
+	if (d_is_positive(dentry) && (flags & LOOKUP_EXCL)) {
+		dput(dentry);
+		return ERR_PTR(-EEXIST);
+	}
 	return dentry;
 }
 EXPORT_SYMBOL(lookup_one_qstr);
@@ -2745,10 +2756,6 @@ static struct dentry *__kern_path_locked(int dfd, struct filename *name, struct
 	}
 	inode_lock_nested(path->dentry->d_inode, I_MUTEX_PARENT);
 	d = lookup_one_qstr(&last, path->dentry, 0);
-	if (!IS_ERR(d) && d_is_negative(d)) {
-		dput(d);
-		d = ERR_PTR(-ENOENT);
-	}
 	if (IS_ERR(d)) {
 		inode_unlock(path->dentry->d_inode);
 		path_put(path);
@@ -4085,27 +4092,13 @@ static struct dentry *filename_create(int dfd, struct filename *name,
 	 * '/', and a directory wasn't requested.
 	 */
 	if (last.name[last.len] && !want_dir)
-		create_flags = 0;
+		create_flags &= ~LOOKUP_CREATE;
 	inode_lock_nested(path->dentry->d_inode, I_MUTEX_PARENT);
 	dentry = lookup_one_qstr(&last, path->dentry,
 				 reval_flag | create_flags);
 	if (IS_ERR(dentry))
 		goto unlock;
 
-	error = -EEXIST;
-	if (d_is_positive(dentry))
-		goto fail;
-
-	/*
-	 * Special case - lookup gave negative, but... we had foo/bar/
-	 * From the vfs_mknod() POV we just have a negative dentry -
-	 * all is fine. Let's be bastards - you had / on the end, you've
-	 * been asking for (non-existent) directory. -ENOENT for you.
-	 */
-	if (unlikely(!create_flags)) {
-		error = -ENOENT;
-		goto fail;
-	}
 	if (unlikely(err2)) {
 		error = err2;
 		goto fail;
@@ -4522,10 +4515,6 @@ int do_rmdir(int dfd, struct filename *name)
 	error = PTR_ERR(dentry);
 	if (IS_ERR(dentry))
 		goto exit3;
-	if (!dentry->d_inode) {
-		error = -ENOENT;
-		goto exit4;
-	}
 	error = security_path_rmdir(&path, dentry);
 	if (error)
 		goto exit4;
@@ -4656,7 +4645,7 @@ int do_unlinkat(int dfd, struct filename *name)
 	if (!IS_ERR(dentry)) {
 
 		/* Why not before? Because we want correct error value */
-		if (last.name[last.len] || d_is_negative(dentry))
+		if (last.name[last.len])
 			goto slashes;
 		inode = dentry->d_inode;
 		ihold(inode);
@@ -4690,9 +4679,7 @@ int do_unlinkat(int dfd, struct filename *name)
 	return error;
 
 slashes:
-	if (d_is_negative(dentry))
-		error = -ENOENT;
-	else if (d_is_dir(dentry))
+	if (d_is_dir(dentry))
 		error = -EISDIR;
 	else
 		error = -ENOTDIR;
@@ -5192,7 +5179,8 @@ int do_renameat2(int olddfd, struct filename *from, int newdfd,
 	struct qstr old_last, new_last;
 	int old_type, new_type;
 	struct inode *delegated_inode = NULL;
-	unsigned int lookup_flags = 0, target_flags = LOOKUP_RENAME_TARGET;
+	unsigned int lookup_flags = 0, target_flags =
+		LOOKUP_RENAME_TARGET | LOOKUP_CREATE;
 	bool should_retry = false;
 	int error = -EINVAL;
 
@@ -5205,6 +5193,8 @@ int do_renameat2(int olddfd, struct filename *from, int newdfd,
 
 	if (flags & RENAME_EXCHANGE)
 		target_flags = 0;
+	if (flags & RENAME_NOREPLACE)
+		target_flags |= LOOKUP_EXCL;
 
 retry:
 	error = filename_parentat(olddfd, from, lookup_flags, &old_path,
@@ -5246,23 +5236,12 @@ int do_renameat2(int olddfd, struct filename *from, int newdfd,
 	error = PTR_ERR(old_dentry);
 	if (IS_ERR(old_dentry))
 		goto exit3;
-	/* source must exist */
-	error = -ENOENT;
-	if (d_is_negative(old_dentry))
-		goto exit4;
 	new_dentry = lookup_one_qstr(&new_last, new_path.dentry,
 				     lookup_flags | target_flags);
 	error = PTR_ERR(new_dentry);
 	if (IS_ERR(new_dentry))
 		goto exit4;
-	error = -EEXIST;
-	if ((flags & RENAME_NOREPLACE) && d_is_positive(new_dentry))
-		goto exit5;
 	if (flags & RENAME_EXCHANGE) {
-		error = -ENOENT;
-		if (d_is_negative(new_dentry))
-			goto exit5;
-
 		if (!d_is_dir(new_dentry)) {
 			error = -ENOTDIR;
 			if (new_last.name[new_last.len])
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 27c7a5c4e91b..8cbe63f4089a 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1531,7 +1531,8 @@ static int nfs_is_exclusive_create(struct inode *dir, unsigned int flags)
 {
 	if (NFS_PROTO(dir)->version == 2)
 		return 0;
-	return flags & LOOKUP_EXCL;
+	return (flags & (LOOKUP_CREATE | LOOKUP_EXCL)) ==
+		(LOOKUP_CREATE | LOOKUP_EXCL);
 }
 
 /*
diff --git a/fs/smb/server/vfs.c b/fs/smb/server/vfs.c
index 89b3823f6405..bf8ac43c39b0 100644
--- a/fs/smb/server/vfs.c
+++ b/fs/smb/server/vfs.c
@@ -113,11 +113,6 @@ static int ksmbd_vfs_path_lookup_locked(struct ksmbd_share_config *share_conf,
 	if (IS_ERR(d))
 		goto err_out;
 
-	if (d_is_negative(d)) {
-		dput(d);
-		goto err_out;
-	}
-
 	path->dentry = d;
 	path->mnt = mntget(parent_path->mnt);
 
@@ -677,6 +672,7 @@ int ksmbd_vfs_rename(struct ksmbd_work *work, const struct path *old_path,
 	struct ksmbd_file *parent_fp;
 	int new_type;
 	int err, lookup_flags = LOOKUP_NO_SYMLINKS;
+	int target_lookup_flags = LOOKUP_RENAME_TARGET;
 
 	if (ksmbd_override_fsids(work))
 		return -ENOMEM;
@@ -687,6 +683,14 @@ int ksmbd_vfs_rename(struct ksmbd_work *work, const struct path *old_path,
 		goto revert_fsids;
 	}
 
+	/*
+	 * explicitly handle file overwrite case, for compatibility with
+	 * filesystems that may not support rename flags (e.g: fuse)
+	 */
+	if (flags & RENAME_NOREPLACE)
+		target_lookup_flags |= LOOKUP_EXCL;
+	flags &= ~(RENAME_NOREPLACE);
+
 retry:
 	err = vfs_path_parent_lookup(to, lookup_flags | LOOKUP_BENEATH,
 				     &new_path, &new_last, &new_type,
@@ -727,7 +731,7 @@ int ksmbd_vfs_rename(struct ksmbd_work *work, const struct path *old_path,
 	}
 
 	new_dentry = lookup_one_qstr(&new_last, new_path.dentry,
-				     lookup_flags | LOOKUP_RENAME_TARGET);
+				     lookup_flags | target_lookup_flags);
 	if (IS_ERR(new_dentry)) {
 		err = PTR_ERR(new_dentry);
 		goto out3;
@@ -738,16 +742,6 @@ int ksmbd_vfs_rename(struct ksmbd_work *work, const struct path *old_path,
 		goto out4;
 	}
 
-	/*
-	 * explicitly handle file overwrite case, for compatibility with
-	 * filesystems that may not support rename flags (e.g: fuse)
-	 */
-	if ((flags & RENAME_NOREPLACE) && d_is_positive(new_dentry)) {
-		err = -EEXIST;
-		goto out4;
-	}
-	flags &= ~(RENAME_NOREPLACE);
-
 	if (old_child == trap) {
 		err = -EINVAL;
 		goto out4;
diff --git a/include/linux/namei.h b/include/linux/namei.h
index 06bb3ea65beb..839a64d07f8c 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -31,7 +31,7 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT};
 /* These tell filesystem methods that we are dealing with the final component... */
 #define LOOKUP_OPEN		0x0100	/* ... in open */
 #define LOOKUP_CREATE		0x0200	/* ... in object creation */
-#define LOOKUP_EXCL		0x0400	/* ... in exclusive creation */
+#define LOOKUP_EXCL		0x0400	/* ... in target must not exist */
 #define LOOKUP_RENAME_TARGET	0x0800	/* ... in destination of rename() */
 
 #define LOOKUP_INTENT_FLAGS (LOOKUP_OPEN | LOOKUP_CREATE | LOOKUP_EXCL | \

From patchwork Thu Feb 6 05:42:43 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 13962216
Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D149B1B59A; Thu, 6 Feb 2025 05:46:25 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738820787; cv=none; b=obb1TRU7689JuDtkafssLj7Y1SeX23gLwTWnYRrq1QVwaZ+C2/Pkefef/aV+Iv9AksBNeM4AIjaVxL2WLBtMiN9dBXu/AQyM6gfOyZmkH6BqT2bClpztHN3XdK9DHXeFI6lyykYMjST3RagG9HEeO81IQyzkUv+m5+OURe2ONpE=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738820787; c=relaxed/simple; bh=nJ7z+us0xIxt19NxCkCDEqwPLCjGDndPBQTINnvaDiw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=gl0a6UAIbjRMBc7QJRYnHvwejOZNax+LWklzNtwTks7ovdlApaJfnJja93RZVazh83zN8+yVbBIMjXjvvQxS0ALH5wMDVT2rMKSfhjJb8ZzPKjAWOdNgyh7Wlxrvs1qYTwd80n7IacrQfeUnkZ9viXNa8IZGpP3nzODRQESQmnw=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de; spf=pass smtp.mailfrom=suse.de; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=HbocadT5; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=GaTpRqk0; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=HbocadT5; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=GaTpRqk0; arc=none smtp.client-ip=195.135.223.131
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de
Authentication-Results: smtp.subspace.kernel.org; spf=pass
smtp.mailfrom=suse.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="HbocadT5"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="GaTpRqk0"; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="HbocadT5"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="GaTpRqk0" Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id F353C1F444; Thu, 6 Feb 2025 05:46:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1738820784; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=vGtxcoxKGyQ37PxwNBvOA1aDF1KBl62XRF8HS/uFC+E=; b=HbocadT5k2HyJJZzgv133nHwa6ZVBHvXVpFJh4Aan9fs6VAEHWF6tHLhWqcWrZW6JetDHt hGu9PBzSCFN1p8mnmY6n2qzWrcj5jEs8HvsbpQUtQFsbRVZy0XGz3WhCyS4tTHZLZu3YoI j29A20/KemsJgqL7+XlgT3FXxR28yLw= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1738820784; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=vGtxcoxKGyQ37PxwNBvOA1aDF1KBl62XRF8HS/uFC+E=; b=GaTpRqk0UnTWiK6kwDfH8e7ccxnJmcliX/CdyCgC51EK41fLfTSqhsHYl8wpDb79VF3X57 /ilkZonxQWtlJGAw== Authentication-Results: smtp-out2.suse.de; dkim=pass header.d=suse.de header.s=susede2_rsa header.b=HbocadT5; dkim=pass header.d=suse.de header.s=susede2_ed25519 header.b=GaTpRqk0 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1738820784; 
h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=vGtxcoxKGyQ37PxwNBvOA1aDF1KBl62XRF8HS/uFC+E=; b=HbocadT5k2HyJJZzgv133nHwa6ZVBHvXVpFJh4Aan9fs6VAEHWF6tHLhWqcWrZW6JetDHt hGu9PBzSCFN1p8mnmY6n2qzWrcj5jEs8HvsbpQUtQFsbRVZy0XGz3WhCyS4tTHZLZu3YoI j29A20/KemsJgqL7+XlgT3FXxR28yLw=
DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1738820784; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=vGtxcoxKGyQ37PxwNBvOA1aDF1KBl62XRF8HS/uFC+E=; b=GaTpRqk0UnTWiK6kwDfH8e7ccxnJmcliX/CdyCgC51EK41fLfTSqhsHYl8wpDb79VF3X57 /ilkZonxQWtlJGAw==
Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 3F39413795; Thu, 6 Feb 2025 05:46:20 +0000 (UTC)
Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id rzwYOaxMpGeABwAAD6G6ig (envelope-from ); Thu, 06 Feb 2025 05:46:20 +0000
From: NeilBrown
To: Alexander Viro, Christian Brauner, Jan Kara, Linus Torvalds, Jeff Layton, Dave Chinner
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 06/19] VFS: repack DENTRY_ flags.
Date: Thu, 6 Feb 2025 16:42:43 +1100
Message-ID: <20250206054504.2950516-7-neilb@suse.de>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250206054504.2950516-1-neilb@suse.de>
References: <20250206054504.2950516-1-neilb@suse.de>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0

Bits 13, 23, 24, and 27 are not used.  Move all those holes to the end.
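To make the effect of the repacking concrete, here is a small user-space sketch, not taken from the kernel. It assumes that bits 0..12 are all in use (the changelog names only 13, 23, 24 and 27 as unused below bit 30), and `OLD_USED`, `NEW_USED`, `count_bits()` and `count_holes()` are hypothetical names introduced for illustration:

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Old layout: bits 0..30 minus the four holes named in the changelog.
 * Assumption: every bit from 0 to 12 is in use. */
#define OLD_USED ((BIT(31) - 1) & ~(BIT(13) | BIT(23) | BIT(24) | BIT(27)))

/* New layout: the same number of flags packed into bits 0..26. */
#define NEW_USED (BIT(27) - 1)

/* Count the set bits in v by clearing the lowest set bit each pass. */
unsigned int count_bits(uint32_t v)
{
	unsigned int n = 0;

	for (; v; v &= v - 1)
		n++;
	return n;
}

/* Count the unused bits that sit below the highest used bit. */
unsigned int count_holes(uint32_t used)
{
	unsigned int top = 0;
	uint32_t v = used;

	if (!used)
		return 0;
	while (v >>= 1)
		top++;	/* top ends up as the index of the highest set bit */
	return (top + 1) - count_bits(used);
}
```

Under that assumption both masks carry 27 flags; the old mask has four interior holes and the new one has none, leaving bits 27..31 free as a contiguous block at the end.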
Signed-off-by: NeilBrown
---
 include/linux/dcache.h | 38 +++++++++++++++++++-------------------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/include/linux/dcache.h b/include/linux/dcache.h
index b03cbb0177a3..d5816cf19538 100644
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -203,34 +203,34 @@ struct dentry_operations {
 #define DCACHE_NFSFS_RENAMED		BIT(12)
      /* this dentry has been "silly renamed" and has to be deleted on the last
       * dput() */
-#define DCACHE_FSNOTIFY_PARENT_WATCHED	BIT(14)
+#define DCACHE_FSNOTIFY_PARENT_WATCHED	BIT(13)
      /* Parent inode is watched by some fsnotify listener */
 
-#define DCACHE_DENTRY_KILLED		BIT(15)
+#define DCACHE_DENTRY_KILLED		BIT(14)
 
-#define DCACHE_MOUNTED			BIT(16) /* is a mountpoint */
-#define DCACHE_NEED_AUTOMOUNT		BIT(17) /* handle automount on this dir */
-#define DCACHE_MANAGE_TRANSIT		BIT(18) /* manage transit from this dirent */
+#define DCACHE_MOUNTED			BIT(15) /* is a mountpoint */
+#define DCACHE_NEED_AUTOMOUNT		BIT(16) /* handle automount on this dir */
+#define DCACHE_MANAGE_TRANSIT		BIT(17) /* manage transit from this dirent */
 #define DCACHE_MANAGED_DENTRY \
 	(DCACHE_MOUNTED|DCACHE_NEED_AUTOMOUNT|DCACHE_MANAGE_TRANSIT)
 
-#define DCACHE_LRU_LIST			BIT(19)
+#define DCACHE_LRU_LIST			BIT(18)
 
-#define DCACHE_ENTRY_TYPE		(7 << 20) /* bits 20..22 are for storing type: */
-#define DCACHE_MISS_TYPE		(0 << 20) /* Negative dentry */
-#define DCACHE_WHITEOUT_TYPE		(1 << 20) /* Whiteout dentry (stop pathwalk) */
-#define DCACHE_DIRECTORY_TYPE		(2 << 20) /* Normal directory */
-#define DCACHE_AUTODIR_TYPE		(3 << 20) /* Lookupless directory (presumed automount) */
-#define DCACHE_REGULAR_TYPE		(4 << 20) /* Regular file type */
-#define DCACHE_SPECIAL_TYPE		(5 << 20) /* Other file type */
-#define DCACHE_SYMLINK_TYPE		(6 << 20) /* Symlink */
+#define DCACHE_ENTRY_TYPE		(7 << 19) /* bits 19..21 are for storing type: */
+#define DCACHE_MISS_TYPE		(0 << 19) /* Negative dentry */
+#define DCACHE_WHITEOUT_TYPE		(1 << 19) /* Whiteout dentry (stop pathwalk) */
+#define DCACHE_DIRECTORY_TYPE		(2 << 19) /* Normal directory */
+#define DCACHE_AUTODIR_TYPE		(3 << 19) /* Lookupless directory (presumed automount) */
+#define DCACHE_REGULAR_TYPE		(4 << 19) /* Regular file type */
+#define DCACHE_SPECIAL_TYPE		(5 << 19) /* Other file type */
+#define DCACHE_SYMLINK_TYPE		(6 << 19) /* Symlink */
 
-#define DCACHE_NOKEY_NAME		BIT(25) /* Encrypted name encoded without key */
-#define DCACHE_OP_REAL			BIT(26)
+#define DCACHE_NOKEY_NAME		BIT(22) /* Encrypted name encoded without key */
+#define DCACHE_OP_REAL			BIT(23)
 
-#define DCACHE_PAR_LOOKUP		BIT(28) /* being looked up (with parent locked shared) */
-#define DCACHE_DENTRY_CURSOR		BIT(29)
-#define DCACHE_NORCU			BIT(30) /* No RCU delay for freeing */
+#define DCACHE_PAR_LOOKUP		BIT(24) /* being looked up (with parent locked shared) */
+#define DCACHE_DENTRY_CURSOR		BIT(25)
+#define DCACHE_NORCU			BIT(26) /* No RCU delay for freeing */
 
 extern seqlock_t rename_lock;
 

From patchwork Thu Feb 6 05:42:44 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 13962217
Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BDDEB22489C; Thu, 6 Feb 2025 05:46:35 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738820797; cv=none; b=brCh9UGVoSRf2VJ2PyhCzeXbMCWfUBM98sJna7GiLyT2ufI5p9y7/uksZq9cnIHX3l/R8wlFY3tZJ95TYLa6iOj+MphJhQ0ktYGpDRcBtc/ezJzfeZZurdnBY4FVPHCreFJwfIcgqhwg6dqusyeIGYr654iiupHMxYQoKT4Zb+w=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738820797; c=relaxed/simple; bh=t0MO4Xthm06JoD2KBBBPHCuqBY6I1wuhuOK+pFBBIgM=;
h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=XS+PMkLw+RDsX3z4AZ7gKONDYEgKdo/FF85fPRqiWvf0bCKPLTJqA25du3Qp0SOI0S/s5CP5HRKVXM746HpSxKfS+rAZzLgakO4UXiUtYax6yVb1z10be9dSOpPs680+eSRo+DefiVr+p/9HtgnhTKrBIg6SAdNiGrUNhZxluKo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de; spf=pass smtp.mailfrom=suse.de; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=BdXSljiK; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=gy7nnXnH; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=BdXSljiK; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=gy7nnXnH; arc=none smtp.client-ip=195.135.223.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="BdXSljiK"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="gy7nnXnH"; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="BdXSljiK"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="gy7nnXnH" Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 447BD1F381; Thu, 6 Feb 2025 05:46:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1738820794; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=pDLQ5/tKoGgGRjuqm2ZNWoZwMtK0kQNV8iRObc8Hijo=; 
b=BdXSljiKOgP7NP72GsFD0oeKHdyJrVHK9Pf+AXQ8ccALMbG+vXGQefFj+m98ar/wamDvdj TUhhsjvogWYz4VD4Z3OCSdjEKL9ZVjF7eAPIV34xd1nx1wQIwVKB/Lf1MwFkaDLMb+C2rP lazDmaa+Kat2uf9wdBKHqZXTK1dPf40= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1738820794; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=pDLQ5/tKoGgGRjuqm2ZNWoZwMtK0kQNV8iRObc8Hijo=; b=gy7nnXnHEgQZMRLk+HBI2p5zB65oV3Nc6Z++Q2BQ5uCDEB3ZVfFf36Ux5Bv9SIXfuR787v 3vqdwPegqMgw5tBQ== Authentication-Results: smtp-out2.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1738820794; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=pDLQ5/tKoGgGRjuqm2ZNWoZwMtK0kQNV8iRObc8Hijo=; b=BdXSljiKOgP7NP72GsFD0oeKHdyJrVHK9Pf+AXQ8ccALMbG+vXGQefFj+m98ar/wamDvdj TUhhsjvogWYz4VD4Z3OCSdjEKL9ZVjF7eAPIV34xd1nx1wQIwVKB/Lf1MwFkaDLMb+C2rP lazDmaa+Kat2uf9wdBKHqZXTK1dPf40= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1738820794; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=pDLQ5/tKoGgGRjuqm2ZNWoZwMtK0kQNV8iRObc8Hijo=; b=gy7nnXnHEgQZMRLk+HBI2p5zB65oV3Nc6Z++Q2BQ5uCDEB3ZVfFf36Ux5Bv9SIXfuR787v 3vqdwPegqMgw5tBQ== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 7D2B313795; Thu, 6 Feb 2025 05:46:31 +0000 (UTC) Received: from dovecot-director2.suse.de 
([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id q6EYDLdMpGeJBwAAD6G6ig (envelope-from ); Thu, 06 Feb 2025 05:46:31 +0000
From: NeilBrown
To: Alexander Viro, Christian Brauner, Jan Kara, Linus Torvalds, Jeff Layton, Dave Chinner
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 07/19] VFS: repack LOOKUP_ bit flags.
Date: Thu, 6 Feb 2025 16:42:44 +1100
Message-ID: <20250206054504.2950516-8-neilb@suse.de>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250206054504.2950516-1-neilb@suse.de>
References: <20250206054504.2950516-1-neilb@suse.de>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0

The LOOKUP_ bits are not in order, which can make it awkward when adding
new bits.  Two bits have recently been added to the end, which makes them
look like "scoping flags", but in fact they aren't.  Also LOOKUP_PARENT
is described as "internal use only" but is used in fs/nfs/.

This patch:
 - Moves these three flags into the "pathwalk mode" section
 - Changes all bits to use the BIT(n) macro
 - Allocates bits in order, leaving gaps between the sections,
   and documents those gaps.
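The sectioned layout described above can be pictured with a short user-space sketch. The section boundaries follow the gaps this patch documents (bits 0..15 for pathwalk, 16..23 for intent, 24..31 for scoping); the `*_MASK` names, `enum lookup_section` and `lookup_section_of()` are hypothetical and do not exist in the kernel:

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Hypothetical masks for the three sections of the LOOKUP_ flag word. */
#define PATHWALK_MASK UINT32_C(0x0000ffff)	/* bits 0..15: 11 used, 5 spare */
#define INTENT_MASK   UINT32_C(0x00ff0000)	/* bits 16..23: 4 used, 4 spare */
#define SCOPE_MASK    UINT32_C(0xff000000)	/* bits 24..31: 5 used, 3 spare */

/* The sections must not overlap and together must cover all 32 bits. */
static_assert((PATHWALK_MASK & INTENT_MASK) == 0, "pathwalk/intent overlap");
static_assert((INTENT_MASK & SCOPE_MASK) == 0, "intent/scope overlap");
static_assert((PATHWALK_MASK | INTENT_MASK | SCOPE_MASK) == UINT32_C(0xffffffff),
	      "sections must cover the whole word");

enum lookup_section {
	SECTION_NONE,
	SECTION_PATHWALK,
	SECTION_INTENT,
	SECTION_SCOPE,
};

/* Report which section a single flag bit belongs to. */
enum lookup_section lookup_section_of(uint32_t flag)
{
	if (flag & PATHWALK_MASK)
		return SECTION_PATHWALK;
	if (flag & INTENT_MASK)
		return SECTION_INTENT;
	if (flag & SCOPE_MASK)
		return SECTION_SCOPE;
	return SECTION_NONE;
}
```

With the values from this patch, LOOKUP_PARENT (BIT(10)) falls in the pathwalk section, LOOKUP_CREATE (BIT(17)) in the intent section, and LOOKUP_BENEATH (BIT(27)) in the scoping section, so a new flag can take a spare bit in its own section without disturbing the others.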
Signed-off-by: NeilBrown
---
 include/linux/namei.h | 46 +++++++++++++++++++++----------------------
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/include/linux/namei.h b/include/linux/namei.h
index 839a64d07f8c..0d81e571a159 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -18,38 +18,38 @@ enum { MAX_NESTED_LINKS = 8 };
 enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT};
 
 /* pathwalk mode */
-#define LOOKUP_FOLLOW		0x0001	/* follow links at the end */
-#define LOOKUP_DIRECTORY	0x0002	/* require a directory */
-#define LOOKUP_AUTOMOUNT	0x0004	/* force terminal automount */
-#define LOOKUP_EMPTY		0x4000	/* accept empty path [user_... only] */
-#define LOOKUP_DOWN		0x8000	/* follow mounts in the starting point */
-#define LOOKUP_MOUNTPOINT	0x0080	/* follow mounts in the end */
-
-#define LOOKUP_REVAL		0x0020	/* tell ->d_revalidate() to trust no cache */
-#define LOOKUP_RCU		0x0040	/* RCU pathwalk mode; semi-internal */
+#define LOOKUP_FOLLOW		BIT(0)	/* follow links at the end */
+#define LOOKUP_DIRECTORY	BIT(1)	/* require a directory */
+#define LOOKUP_AUTOMOUNT	BIT(2)	/* force terminal automount */
+#define LOOKUP_EMPTY		BIT(3)	/* accept empty path [user_... only] */
+#define LOOKUP_LINKAT_EMPTY	BIT(4)	/* Linkat request with empty path. */
+#define LOOKUP_DOWN		BIT(5)	/* follow mounts in the starting point */
+#define LOOKUP_MOUNTPOINT	BIT(6)	/* follow mounts in the end */
+#define LOOKUP_REVAL		BIT(7)	/* tell ->d_revalidate() to trust no cache */
+#define LOOKUP_RCU		BIT(8)	/* RCU pathwalk mode; semi-internal */
+#define LOOKUP_CACHED		BIT(9)	/* Only do cached lookup */
+#define LOOKUP_PARENT		BIT(10)	/* Looking up final parent in path */
+/* 5 spare bits for pathwalk */
 
 /* These tell filesystem methods that we are dealing with the final component... */
-#define LOOKUP_OPEN		0x0100	/* ... in open */
-#define LOOKUP_CREATE		0x0200	/* ... in object creation */
-#define LOOKUP_EXCL		0x0400	/* ... in target must not exist */
-#define LOOKUP_RENAME_TARGET	0x0800	/* ... in destination of rename() */
+#define LOOKUP_OPEN		BIT(16)	/* ... in open */
+#define LOOKUP_CREATE		BIT(17)	/* ... in object creation */
+#define LOOKUP_EXCL		BIT(18)	/* ... in target must not exist */
+#define LOOKUP_RENAME_TARGET	BIT(19)	/* ... in destination of rename() */
 
 #define LOOKUP_INTENT_FLAGS (LOOKUP_OPEN | LOOKUP_CREATE | LOOKUP_EXCL | \
 					LOOKUP_RENAME_TARGET)
-
-/* internal use only */
-#define LOOKUP_PARENT		0x0010
+/* 4 spare bits for intent */
 
 /* Scoping flags for lookup. */
-#define LOOKUP_NO_SYMLINKS	0x010000 /* No symlink crossing. */
-#define LOOKUP_NO_MAGICLINKS	0x020000 /* No nd_jump_link() crossing. */
-#define LOOKUP_NO_XDEV		0x040000 /* No mountpoint crossing. */
-#define LOOKUP_BENEATH		0x080000 /* No escaping from starting point. */
-#define LOOKUP_IN_ROOT		0x100000 /* Treat dirfd as fs root. */
-#define LOOKUP_CACHED		0x200000 /* Only do cached lookup */
-#define LOOKUP_LINKAT_EMPTY	0x400000 /* Linkat request with empty path. */
+#define LOOKUP_NO_SYMLINKS	BIT(24) /* No symlink crossing. */
+#define LOOKUP_NO_MAGICLINKS	BIT(25) /* No nd_jump_link() crossing. */
+#define LOOKUP_NO_XDEV		BIT(26) /* No mountpoint crossing. */
+#define LOOKUP_BENEATH		BIT(27) /* No escaping from starting point. */
+#define LOOKUP_IN_ROOT		BIT(28) /* Treat dirfd as fs root. */
 /* LOOKUP_* flags which do scope-related checks based on the dirfd. */
 #define LOOKUP_IS_SCOPED (LOOKUP_BENEATH | LOOKUP_IN_ROOT)
+/* 3 spare bits for scoping */
 
 extern int path_pts(struct path *path);
 

From patchwork Thu Feb 6 05:42:45 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 13962248
Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 254392248AC; Thu, 6 Feb 2025 05:46:50 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738820812; cv=none; b=bgyp55/+jxGH2de0NgngzFM2/Sb+vK9fKVduvZyzDNWYDxH8eO+JMOyZ8Yo0wcI3RfrrC+WEifeyZZLQM6VyVugg+V8SJ2PXXsYNmSxtrFUPe8o8MESGkhreFvkoLUbJ4J3l3+fO2af/vtWU81Zs+01fXcLawnrehyd3xUQxb5E=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738820812; c=relaxed/simple; bh=iL/nhqws/lXvFKaqQy7FzZc+bidN+UBJRyrqXGrYlDM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=uJT0OmJrSvTjhnlQnoFmOkPkSLthbwPkUJDIRNNyFyX+pmcfpyt7v/RDk3VDE6Ddx6VlLnbktcZ3ZdFcyiQ/nYXT59EmBc8UV+CQaQRnk7FjmQsJ3MSspg0wOjer7zF4qoeCPTQE4TLeNw+oBKhFVfRa3N0eGJjd5T0ehbqEgLU=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de; spf=pass smtp.mailfrom=suse.de; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=UF43y6Eb; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=a4pKuja8; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=UF43y6Eb; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=a4pKuja8; arc=none smtp.client-ip=195.135.223.131
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de
Authentication-Results:
content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=O8L/HH4OAeWFVdkvYr7ks2OhMu7jHsU5kLHRPnlUFHo=; b=UF43y6EbY62n9ITb24RVzoJ4h4g7dwHffT9jcyquJDBzHcKUgeqrTSbUXcswbDWcA8iSkU Rss+IEy2dViPdQzYRlMiiOiZ+Rdb23PKUfZ8dXuHhaa1qF1DIAYjpFXYmc/KV0Uqpabpae mWEyAlmMKWzTU7yIhV7dnjK4/8HKuto= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1738820808; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=O8L/HH4OAeWFVdkvYr7ks2OhMu7jHsU5kLHRPnlUFHo=; b=a4pKuja8eTeUZ/qLDVJHZyceidRG4S1uOx0yYoh14Jn2JtaagPn+ZS8x4R9qT7JCX3PaW9 vxomYNQeR7MoNGDw== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id C202413795; Thu, 6 Feb 2025 05:46:45 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id NCEEHcVMpGehBwAAD6G6ig (envelope-from ); Thu, 06 Feb 2025 05:46:45 +0000 From: NeilBrown To: Alexander Viro , Christian Brauner , Jan Kara , Linus Torvalds , Jeff Layton , Dave Chinner Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 08/19] VFS: introduce lookup_and_lock() and friends Date: Thu, 6 Feb 2025 16:42:45 +1100 Message-ID: <20250206054504.2950516-9-neilb@suse.de> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250206054504.2950516-1-neilb@suse.de> References: <20250206054504.2950516-1-neilb@suse.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Score: -2.80 X-Spamd-Result: default: False [-2.80 / 50.00]; BAYES_HAM(-3.00)[100.00%]; 
MID_CONTAINS_FROM(1.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_MISSING_CHARSET(0.50)[]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; RCVD_VIA_SMTP_AUTH(0.00)[]; RCPT_COUNT_SEVEN(0.00)[8]; TO_DN_SOME(0.00)[]; MIME_TRACE(0.00)[0:+]; ARC_NA(0.00)[]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; FUZZY_BLOCKED(0.00)[rspamd.com]; FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[]; R_RATELIMIT(0.00)[from(RLewrxuus8mos16izbn)]; RCVD_COUNT_TWO(0.00)[2]; TO_MATCH_ENVRCPT_ALL(0.00)[]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:mid,suse.de:email]; RCVD_TLS_ALL(0.00)[] X-Spam-Flag: NO X-Spam-Level:

lookup_and_lock() combines locking the directory and performing a lookup
prior to a change to the directory.
Abstracting this prepares for changing the locking requirements.

done_lookup_and_lock() provides the inverse: it puts the dentry and
unlocks the directory.

For "silly_rename" we will need to lookup_and_lock() in a directory that
is already locked. For this purpose we add LOOKUP_PARENT_LOCKED.

Like lookup_one_qstr(), lookup_and_lock() returns -ENOENT if
LOOKUP_CREATE was NOT given and the name cannot be found, and returns
-EEXIST if LOOKUP_EXCL WAS given and the name CAN be found.

These functions replace all uses of lookup_one_qstr() in namei.c
except for those used for rename.

The name might seem backwards as the lock happens before the lookup.
A future patch will change this so that only a shared lock is taken
before the lookup, and an exclusive lock on the dentry is taken after a
successful lookup. So the order "lookup" then "lock" will make sense.

This functionality is exported as lookup_and_lock_one() which takes a
name and len rather than a qstr.
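Reviewer note: the contract described in the commit message above can be sketched in userspace as follows. This is only a model under stated assumptions, not the kernel code: struct model_dir, the model_* functions and the MODEL_LOOKUP_* values are all hypothetical stand-ins, and a simple counter takes the place of the directory's i_rwsem. It shows the two error cases (-ENOENT, -EEXIST), that the directory is left unlocked on error, and that the PARENT_LOCKED flag makes both helpers leave the lock alone.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical flag values for this model only; not the kernel's BIT() values. */
#define MODEL_LOOKUP_CREATE        (1u << 0)
#define MODEL_LOOKUP_EXCL          (1u << 1)
#define MODEL_LOOKUP_PARENT_LOCKED (1u << 2)

/* Stand-in for a directory: a lock counter plus the names that exist in it. */
struct model_dir {
	int lock_depth;         /* 1 while "locked", 0 otherwise */
	const char *names[4];
	int nr_names;
};

static bool model_name_exists(const struct model_dir *dir, const char *name)
{
	for (int i = 0; i < dir->nr_names; i++)
		if (strcmp(dir->names[i], name) == 0)
			return true;
	return false;
}

/*
 * Returns 0 with the directory locked, or a negative errno with the lock
 * state exactly as it was on entry.
 */
static int model_lookup_and_lock(struct model_dir *dir, const char *name,
				 unsigned int flags)
{
	bool found;
	int err = 0;

	/* Take the lock unless the caller says the parent is already locked. */
	if (!(flags & MODEL_LOOKUP_PARENT_LOCKED))
		dir->lock_depth++;

	found = model_name_exists(dir, name);
	if (!found && !(flags & MODEL_LOOKUP_CREATE))
		err = -ENOENT;
	else if (found && (flags & MODEL_LOOKUP_EXCL))
		err = -EEXIST;

	/* On failure, undo only the lock that this call took. */
	if (err && !(flags & MODEL_LOOKUP_PARENT_LOCKED))
		dir->lock_depth--;
	return err;
}

/* Inverse of a successful model_lookup_and_lock() with the same flags. */
static void model_done_lookup_and_lock(struct model_dir *dir, unsigned int flags)
{
	if (!(flags & MODEL_LOOKUP_PARENT_LOCKED)) {
		assert(dir->lock_depth > 0);
		dir->lock_depth--;
	}
}
```

The point of the model is the pairing rule: the same flags must be passed to both helpers, otherwise the unlock decision in the "done" step will not match the lock decision made earlier.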
Signed-off-by: NeilBrown --- fs/namei.c | 102 ++++++++++++++++++++++++++++-------------- include/linux/namei.h | 15 ++++++- 2 files changed, 83 insertions(+), 34 deletions(-) diff --git a/fs/namei.c b/fs/namei.c index 69610047f6c6..3c0feca081a2 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -1715,6 +1715,41 @@ struct dentry *lookup_one_qstr(const struct qstr *name, } EXPORT_SYMBOL(lookup_one_qstr); +static struct dentry *lookup_and_lock_nested(const struct qstr *last, + struct dentry *base, + unsigned int lookup_flags, + unsigned int subclass) +{ + struct dentry *dentry; + + if (!(lookup_flags & LOOKUP_PARENT_LOCKED)) + inode_lock_nested(base->d_inode, subclass); + + dentry = lookup_one_qstr(last, base, lookup_flags); + if (IS_ERR(dentry) && !(lookup_flags & LOOKUP_PARENT_LOCKED)) { + inode_unlock(base->d_inode); + } + return dentry; +} + +static struct dentry *lookup_and_lock(const struct qstr *last, + struct dentry *base, + unsigned int lookup_flags) +{ + return lookup_and_lock_nested(last, base, lookup_flags, + I_MUTEX_PARENT); +} + +void done_lookup_and_lock(struct dentry *base, struct dentry *dentry, + unsigned int lookup_flags) +{ + d_lookup_done(dentry); + dput(dentry); + if (!(lookup_flags & LOOKUP_PARENT_LOCKED)) + inode_unlock(base->d_inode); +} +EXPORT_SYMBOL(done_lookup_and_lock); + /** * lookup_fast - do fast lockless (but racy) lookup of a dentry * @nd: current nameidata @@ -2754,12 +2789,9 @@ static struct dentry *__kern_path_locked(int dfd, struct filename *name, struct path_put(path); return ERR_PTR(-EINVAL); } - inode_lock_nested(path->dentry->d_inode, I_MUTEX_PARENT); - d = lookup_one_qstr(&last, path->dentry, 0); - if (IS_ERR(d)) { - inode_unlock(path->dentry->d_inode); + d = lookup_and_lock(&last, path->dentry, 0); + if (IS_ERR(d)) path_put(path); - } return d; } @@ -3053,6 +3085,22 @@ struct dentry *lookup_positive_unlocked(const char *name, } EXPORT_SYMBOL(lookup_positive_unlocked); +struct dentry *lookup_and_lock_one(struct mnt_idmap *idmap, + 
const char *name, int len, struct dentry *base, + unsigned int lookup_flags) +{ + struct qstr this; + int err; + + if (!idmap) + idmap = &nop_mnt_idmap; + err = lookup_one_common(idmap, name, base, len, &this); + if (err) + return ERR_PTR(err); + return lookup_and_lock(&this, base, lookup_flags); +} +EXPORT_SYMBOL(lookup_and_lock_one); + #ifdef CONFIG_UNIX98_PTYS int path_pts(struct path *path) { @@ -4071,7 +4119,6 @@ static struct dentry *filename_create(int dfd, struct filename *name, unsigned int reval_flag = lookup_flags & LOOKUP_REVAL; unsigned int create_flags = LOOKUP_CREATE | LOOKUP_EXCL; int type; - int err2; int error; error = filename_parentat(dfd, name, reval_flag, path, &last, &type); @@ -4083,36 +4130,30 @@ static struct dentry *filename_create(int dfd, struct filename *name, * (foo/., foo/.., /////) */ if (unlikely(type != LAST_NORM)) - goto out; + goto put; /* don't fail immediately if it's r/o, at least try to report other errors */ - err2 = mnt_want_write(path->mnt); + error = mnt_want_write(path->mnt); /* * Do the final lookup. Suppress 'create' if there is a trailing * '/', and a directory wasn't requested. 
*/ if (last.name[last.len] && !want_dir) create_flags &= ~LOOKUP_CREATE; - inode_lock_nested(path->dentry->d_inode, I_MUTEX_PARENT); - dentry = lookup_one_qstr(&last, path->dentry, - reval_flag | create_flags); + dentry = lookup_and_lock(&last, path->dentry, reval_flag | create_flags); if (IS_ERR(dentry)) - goto unlock; + goto drop; - if (unlikely(err2)) { - error = err2; + if (unlikely(error)) goto fail; - } return dentry; fail: - d_lookup_done(dentry); - dput(dentry); + done_lookup_and_lock(path->dentry, dentry, reval_flag | create_flags); dentry = ERR_PTR(error); -unlock: - inode_unlock(path->dentry->d_inode); - if (!err2) +drop: + if (!error) mnt_drop_write(path->mnt); -out: +put: path_put(path); return dentry; } @@ -4130,14 +4171,13 @@ EXPORT_SYMBOL(kern_path_create); void done_path_create(struct path *path, struct dentry *dentry) { - dput(dentry); - inode_unlock(path->dentry->d_inode); + done_lookup_and_lock(path->dentry, dentry, LOOKUP_CREATE); mnt_drop_write(path->mnt); path_put(path); } EXPORT_SYMBOL(done_path_create); -inline struct dentry *user_path_create(int dfd, const char __user *pathname, +struct dentry *user_path_create(int dfd, const char __user *pathname, struct path *path, unsigned int lookup_flags) { struct filename *filename = getname(pathname); @@ -4510,19 +4550,18 @@ int do_rmdir(int dfd, struct filename *name) if (error) goto exit2; - inode_lock_nested(path.dentry->d_inode, I_MUTEX_PARENT); - dentry = lookup_one_qstr(&last, path.dentry, lookup_flags); + dentry = lookup_and_lock(&last, path.dentry, lookup_flags); error = PTR_ERR(dentry); if (IS_ERR(dentry)) goto exit3; + error = security_path_rmdir(&path, dentry); if (error) goto exit4; error = vfs_rmdir(mnt_idmap(path.mnt), path.dentry->d_inode, dentry); exit4: - dput(dentry); + done_lookup_and_lock(path.dentry, dentry, lookup_flags); exit3: - inode_unlock(path.dentry->d_inode); mnt_drop_write(path.mnt); exit2: path_put(&path); @@ -4639,11 +4678,9 @@ int do_unlinkat(int dfd, struct filename 
*name) if (error) goto exit2; retry_deleg: - inode_lock_nested(path.dentry->d_inode, I_MUTEX_PARENT); - dentry = lookup_one_qstr(&last, path.dentry, lookup_flags); + dentry = lookup_and_lock(&last, path.dentry, lookup_flags); error = PTR_ERR(dentry); if (!IS_ERR(dentry)) { - /* Why not before? Because we want correct error value */ if (last.name[last.len]) goto slashes; @@ -4655,9 +4692,8 @@ int do_unlinkat(int dfd, struct filename *name) error = vfs_unlink(mnt_idmap(path.mnt), path.dentry->d_inode, dentry, &delegated_inode); exit3: - dput(dentry); + done_lookup_and_lock(path.dentry, dentry, lookup_flags); } - inode_unlock(path.dentry->d_inode); if (inode) iput(inode); /* truncate the inode here */ inode = NULL; diff --git a/include/linux/namei.h b/include/linux/namei.h index 0d81e571a159..76c587a5ec3a 100644 --- a/include/linux/namei.h +++ b/include/linux/namei.h @@ -29,7 +29,11 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT}; #define LOOKUP_RCU BIT(8) /* RCU pathwalk mode; semi-internal */ #define LOOKUP_CACHED BIT(9) /* Only do cached lookup */ #define LOOKUP_PARENT BIT(10) /* Looking up final parent in path */ -/* 5 spare bits for pathwalk */ +#define LOOKUP_PARENT_LOCKED BIT(11) /* filesystem sets this for nested + * "lookup_and_lock_one" when it knows + * parent is sufficiently locked. + */ +/* 4 spare bits for pathwalk */ /* These tell filesystem methods that we are dealing with the final component... */ #define LOOKUP_OPEN BIT(16) /* ... 
in open */ @@ -82,6 +86,15 @@ struct dentry *lookup_one_unlocked(struct mnt_idmap *idmap, struct dentry *lookup_one_positive_unlocked(struct mnt_idmap *idmap, const char *name, struct dentry *base, int len); +struct dentry *lookup_and_lock_one(struct mnt_idmap *idmap, + const char *name, int len, struct dentry *base, + unsigned int lookup_flags); +struct dentry *__lookup_and_lock_one(struct mnt_idmap *idmap, + const char *name, int len, struct dentry *base, + unsigned int lookup_flags); +void done_lookup_and_lock(struct dentry *base, struct dentry *dentry, + unsigned int lookup_flags); +void __done_lookup_and_lock(struct dentry *dentry); extern int follow_down_one(struct path *); extern int follow_down(struct path *path, unsigned int flags); From patchwork Thu Feb 6 05:42:46 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 13962249 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 04B58223331; Thu, 6 Feb 2025 05:47:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738820823; cv=none; b=phv035SkGYRWeQRh0Uob+kIsj9Y2DNEdSZhxct2W0nP1OBUphBJfLfpQN2iHbEGQ0kLqFa8ZEEZhvMcpbD26wIXmf2+6XsOI29Ge6MtForHmWkYBHCFFZT6u24GSkdTkSIfbDz28DJ605bvbjj5RowurNRf7Js6JeHi0r9T1qaQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738820823; c=relaxed/simple; bh=gdygCKhQb3w+GBt9jJ3gW/QJifvNA5+bTj1JxdwXE34=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=BVzooqB8XWHjVLil9FZuSEXA8mzPFlIyRQXOWqwBKn0BjKzuN660oGzWqRMkSOHdwgoRZO0wfNxOfh3PTDYYIsKYirLGpKA0lyfSYZYIuUctELLcDnRlcDDn4ysknD5JZTGe8SDZTYMh3icTzSn5/F3XhkGwnjZE1tIFj0jzUwo= 
Y2JQMc9MpGevBwAAD6G6ig (envelope-from ); Thu, 06 Feb 2025 05:46:55 +0000 From: NeilBrown To: Alexander Viro , Christian Brauner , Jan Kara , Linus Torvalds , Jeff Layton , Dave Chinner Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 09/19] VFS: add _async versions of the various directory modifying inode_operations Date: Thu, 6 Feb 2025 16:42:46 +1100 Message-ID: <20250206054504.2950516-10-neilb@suse.de> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250206054504.2950516-1-neilb@suse.de> References: <20250206054504.2950516-1-neilb@suse.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Rspamd-Queue-Id: 2B0CF1F381 X-Spam-Level: X-Spamd-Result: default: False [-3.01 / 50.00]; BAYES_HAM(-3.00)[100.00%]; MID_CONTAINS_FROM(1.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_MISSING_CHARSET(0.50)[]; NEURAL_HAM_SHORT(-0.20)[-1.000]; R_DKIM_ALLOW(-0.20)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; MIME_GOOD(-0.10)[text/plain]; MX_GOOD(-0.01)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; FUZZY_BLOCKED(0.00)[rspamd.com]; MIME_TRACE(0.00)[0:+]; ARC_NA(0.00)[]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; RCVD_TLS_ALL(0.00)[]; DKIM_TRACE(0.00)[suse.de:+]; FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; RCVD_VIA_SMTP_AUTH(0.00)[]; RCPT_COUNT_SEVEN(0.00)[8]; R_RATELIMIT(0.00)[from(RLewrxuus8mos16izbn)]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:email,suse.de:dkim,suse.de:mid] X-Rspamd-Server: rspamd2.dmz-prg2.suse.org X-Rspamd-Action: no action X-Spam-Score: -3.01 X-Spam-Flag: NO These "_async" versions of various inode operations are only guaranteed a shared lock on the directory but if the directory isn't exclusively locked then they are guaranteed an exclusive lock on the dentry within the directory (which will be implemented in a later patch). 
This will allow a graceful transition from exclusive to shared locking for directory updates, and even to async updates which can complete with no lock on the directory - only on the dentry. mkdir_async is a bit different as it optionally returns a new dentry for cases when the filesystem is not able to use the original dentry. This allows vfs_mkdir_return() to avoid the need for an extra lookup. Signed-off-by: NeilBrown --- Documentation/filesystems/locking.rst | 51 ++++++++- Documentation/filesystems/porting.rst | 10 ++ Documentation/filesystems/vfs.rst | 24 +++++ fs/namei.c | 142 +++++++++++++++++++++----- include/linux/fs.h | 24 +++++ 5 files changed, 223 insertions(+), 28 deletions(-) diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst index d20a32b77b60..adeead366332 100644 --- a/Documentation/filesystems/locking.rst +++ b/Documentation/filesystems/locking.rst @@ -62,15 +62,24 @@ inode_operations prototypes:: int (*create) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t, bool); + int (*create_async) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t, bool, struct dirop_ret *); struct dentry * (*lookup) (struct inode *,struct dentry *, unsigned int); int (*link) (struct dentry *,struct inode *,struct dentry *); + int (*link_async) (struct dentry *,struct inode *,struct dentry *, struct dirop_ret *); int (*unlink) (struct inode *,struct dentry *); + int (*unlink_async) (struct inode *,struct dentry *, struct dirop_ret *); int (*symlink) (struct mnt_idmap *, struct inode *,struct dentry *,const char *); + int (*symlink_async) (struct mnt_idmap *, struct inode *,struct dentry *,const char *, struct dirop_ret *); int (*mkdir) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t); + struct dentry * (*mkdir_async) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t, struct dirop_ret *); int (*rmdir) (struct inode *,struct dentry *); + int (*rmdir_async) (struct inode *,struct dentry *,
struct dirop_ret *); int (*mknod) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t,dev_t); + int (*mknod_async) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t,dev_t, struct dirop_ret *); int (*rename) (struct mnt_idmap *, struct inode *, struct dentry *, struct inode *, struct dentry *, unsigned int); + int (*rename_async) (struct mnt_idmap *, struct inode *, struct dentry *, + struct inode *, struct dentry *, unsigned int, struct dirop_ret *); int (*readlink) (struct dentry *, char __user *,int); const char *(*get_link) (struct dentry *, struct inode *, struct delayed_call *); void (*truncate) (struct inode *); @@ -84,6 +93,9 @@ prototypes:: int (*atomic_open)(struct inode *, struct dentry *, struct file *, unsigned open_flag, umode_t create_mode); + int (*atomic_open_async)(struct inode *, struct dentry *, + struct file *, unsigned open_flag, + umode_t create_mode, struct dirop_ret *); int (*tmpfile) (struct mnt_idmap *, struct inode *, struct file *, umode_t); int (*fileattr_set)(struct mnt_idmap *idmap, @@ -95,18 +107,33 @@ prototypes:: locking rules: all may block +All directory-modifying operations are called with an exclusive lock on +the target dentry or dentries using DCACHE_PAR_LOOKUP. This allows the +shared lock on i_rwsem for the _async ops to be safe. The lock on +i_rwsem may be dropped as soon as the op returns, though if it returns +-EINPROGRESS the lock using DCACHE_PAR_UPDATE will not be dropped until +the callback is called. 
+ ============== ================================================== ops i_rwsem(inode) ============== ================================================== lookup: shared create: exclusive +create_async: shared link: exclusive (both) +link_async: exclusive on source, shared on target mknod: exclusive +mknod_async: shared symlink: exclusive +symlink_async: shared mkdir: exclusive +mkdir_async: shared unlink: exclusive (both) +unlink_async: exclusive on object, shared on directory/name rmdir: exclusive (both)(see below) +rmdir_async: exclusive on object, shared on directory/name (see below) rename: exclusive (both parents, some children) (see below) +rename_async: shared (both parents) exclusive (some children) (see below) readlink: no get_link: no setattr: exclusive @@ -118,6 +145,7 @@ listxattr: no fiemap: no update_time: no atomic_open: shared (exclusive if O_CREAT is set in open flags) +atomic_open_async: shared (if O_CREAT is not set, then may not have exclusive lock on name) tmpfile: no fileattr_get: no or exclusive fileattr_set: exclusive @@ -125,8 +153,10 @@ get_offset_ctx no ============== ================================================== - Additionally, ->rmdir(), ->unlink() and ->rename() have ->i_rwsem - exclusive on victim. + Additionally, ->rmdir(), ->unlink() and ->rename(), as well as _async + versions, have ->i_rwsem exclusive on victim. This exclusive lock + may be dropped when the op completes even if the async operation is + continuing. cross-directory ->rename() has (per-superblock) ->s_vfs_rename_sem. ->unlink() and ->rename() have ->i_rwsem exclusive on all non-directories involved. @@ -135,6 +165,23 @@ get_offset_ctx no See Documentation/filesystems/directory-locking.rst for more detailed discussion of the locking scheme for directory operations. 
+The _async operations will be passed a (non-NULL) struct dirop_ret pointer:: + + struct dirop_ret { + union { + int err; + struct dentry *dentry; + }; + void (*done_cb)(struct dirop_ret*); + }; + +They may return -EINPROGRESS (or ERR_PTR(-EINPROGRESS)) in which case +the op will continue asynchronously. When it completes, the result, +which must NOT be -EINPROGRESS, is stored in err or dentry (as +appropriate) and the done_cb() function is called. Callers can only +make use of the asynchrony when they determine that no lock need be held +on i_rwsem. + xattr_handler operations ======================== diff --git a/Documentation/filesystems/porting.rst b/Documentation/filesystems/porting.rst index 1639e78e3146..a736c9f30d9d 100644 --- a/Documentation/filesystems/porting.rst +++ b/Documentation/filesystems/porting.rst @@ -1157,3 +1157,13 @@ in normal case it points into the pathname being looked up. NOTE: if you need something like full path from the root of filesystem, you are still on your own - this assists with simple cases, but it's not magic. + +--- + +**recommended** + +create_async, link_async, unlink_async, rmdir_async, mknod_async, +rename_async, atomic_open_async can be provided instead of the +corresponding inode_operations without the "_async" suffix. Multiple +_async operations can be performed in a given directory concurrently, +but never on the same name.
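Reviewer note: the dirop_ret completion protocol documented in the locking.rst hunk above can be modelled in userspace roughly as follows. This is a single-threaded sketch under stated assumptions, not the kernel implementation: struct model_dirop_ret mirrors only the err member of the union, the model_* names are hypothetical, and where the kernel's DO_DIROP macro sleeps in wait_var_event() until done_cb() wakes it, this model simply drives the pending completion itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for struct dirop_ret (err member only). */
struct model_dirop_ret {
	int err;
	void (*done_cb)(struct model_dirop_ret *);
};

/* At most one deferred operation is tracked in this model. */
static struct model_dirop_ret *model_pending;
static int model_pending_result;
static int model_cb_calls;

static void model_done_cb(struct model_dirop_ret *dret)
{
	(void)dret;
	model_cb_calls++;   /* the kernel version calls wake_up_var() here */
}

/* Hypothetical op that finishes immediately and never touches dret. */
static int model_op_sync(int result, struct model_dirop_ret *dret)
{
	(void)dret;
	return result;
}

/* Hypothetical op that defers its work and reports -EINPROGRESS. */
static int model_op_deferred(int result, struct model_dirop_ret *dret)
{
	model_pending = dret;
	model_pending_result = result;
	return -EINPROGRESS;
}

/* Finish the deferred op: store the result first, then call done_cb(). */
static bool model_complete_pending(void)
{
	struct model_dirop_ret *dret = model_pending;

	if (!dret)
		return false;
	model_pending = NULL;
	/* The documented rule: the final result must not be -EINPROGRESS. */
	assert(model_pending_result != -EINPROGRESS);
	dret->err = model_pending_result;
	dret->done_cb(dret);
	return true;
}

/* Synchronous wrapper in the spirit of DO_DIROP(). */
static int model_do_dirop(int (*op)(int, struct model_dirop_ret *), int result)
{
	struct model_dirop_ret dret = {
		.err = -EINPROGRESS,
		.done_cb = model_done_cb,
	};
	int ret = op(result, &dret);

	if (ret == -EINPROGRESS) {
		/* Stop if nothing is pending, so the loop cannot spin forever. */
		while (dret.err == -EINPROGRESS && model_complete_pending())
			;
		ret = dret.err;
	}
	return ret;
}
```

Initialising err to -EINPROGRESS before the call is what lets the waiter tell "not finished yet" apart from any real result, which is why the protocol forbids -EINPROGRESS as a final value.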
diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst index 31eea688609a..e18655054e6c 100644 --- a/Documentation/filesystems/vfs.rst +++ b/Documentation/filesystems/vfs.rst @@ -491,15 +491,24 @@ As of kernel 2.6.22, the following members are defined: struct inode_operations { int (*create) (struct mnt_idmap *, struct inode *,struct dentry *, umode_t, bool); + int (*create_async) (struct mnt_idmap *, struct inode *,struct dentry *, umode_t, bool, struct dirop_ret *); struct dentry * (*lookup) (struct inode *,struct dentry *, unsigned int); int (*link) (struct dentry *,struct inode *,struct dentry *); + int (*link_async) (struct dentry *,struct inode *,struct dentry *, struct dirop_ret *); int (*unlink) (struct inode *,struct dentry *); + int (*unlink_async) (struct inode *,struct dentry *, struct dirop_ret *); int (*symlink) (struct mnt_idmap *, struct inode *,struct dentry *,const char *); + int (*symlink_async) (struct mnt_idmap *, struct inode *,struct dentry *,const char *, struct dirop_ret *); int (*mkdir) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t); + struct dentry * (*mkdir_async) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t, struct dirop_ret *); int (*rmdir) (struct inode *,struct dentry *); + int (*rmdir_async) (struct inode *,struct dentry *, struct dirop_ret *); int (*mknod) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t,dev_t); + int (*mknod_async) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t,dev_t, struct dirop_ret *); int (*rename) (struct mnt_idmap *, struct inode *, struct dentry *, struct inode *, struct dentry *, unsigned int); + int (*rename_async) (struct mnt_idmap *, struct inode *, struct dentry *, + struct inode *, struct dentry *, unsigned int, struct dirop_ret *); int (*readlink) (struct dentry *, char __user *,int); const char *(*get_link) (struct dentry *, struct inode *, struct delayed_call *); @@ -511,6 +520,8 @@ As of kernel 2.6.22, the following 
members are defined: void (*update_time)(struct inode *, struct timespec *, int); int (*atomic_open)(struct inode *, struct dentry *, struct file *, unsigned open_flag, umode_t create_mode); + int (*atomic_open_async)(struct inode *, struct dentry *, struct file *, + unsigned open_flag, umode_t create_mode, struct dirop_ret *); int (*tmpfile) (struct mnt_idmap *, struct inode *, struct file *, umode_t); struct posix_acl * (*get_acl)(struct mnt_idmap *, struct dentry *, int); int (*set_acl)(struct mnt_idmap *, struct dentry *, struct posix_acl *, int); @@ -524,6 +535,7 @@ Again, all methods are called without any locks being held, unless otherwise noted. ``create`` +``create_async`` called by the open(2) and creat(2) system calls. Only required if you want to support regular files. The dentry you get should not have an inode (i.e. it should be a negative dentry). Here @@ -546,29 +558,39 @@ otherwise noted. directory inode semaphore held ``link`` +``link_async`` called by the link(2) system call. Only required if you want to support hard links. You will probably need to call d_instantiate() just as you would in the create() method ``unlink`` +``unlink_async`` called by the unlink(2) system call. Only required if you want to support deleting inodes ``symlink`` +``symlink_async`` called by the symlink(2) system call. Only required if you want to support symlinks. You will probably need to call d_instantiate() just as you would in the create() method ``mkdir`` +``mkdir_async`` called by the mkdir(2) system call. Only required if you want to support creating subdirectories. You will probably need to call d_instantiate() just as you would in the create() method + mkdir_async can return an alternate dentry, much like lookup. + In this case the original dentry will still be negative and will + be unhashed. + ``rmdir`` +``rmdir_async`` called by the rmdir(2) system call. 
Only required if you want to support deleting subdirectories ``mknod`` +``mknod_async`` called by the mknod(2) system call to create a device (char, block) inode or a named pipe (FIFO) or socket. Only required if you want to support creating these types of inodes. You will @@ -576,6 +598,7 @@ otherwise noted. create() method ``rename`` +``rename_async`` called by the rename(2) system call to rename the object to have the parent and name given by the second inode and dentry. @@ -647,6 +670,7 @@ otherwise noted. itself and call mark_inode_dirty_sync. ``atomic_open`` +``atomic_open_async`` called on the last component of an open. Using this optional method the filesystem can look up, possibly create and open the file in one atomic operation. If it wants to leave actual diff --git a/fs/namei.c b/fs/namei.c index 3c0feca081a2..eadde9de73bf 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -123,6 +123,41 @@ * PATH_MAX includes the nul terminator --RR. */ +static void dirop_done_cb(struct dirop_ret *dret) +{ + wake_up_var(dret); +} + +#define DO_DIROP(dir, op, ...) \ + ({ \ + struct dirop_ret dret; \ + int ret; \ + dret.err = -EINPROGRESS; \ + dret.done_cb = dirop_done_cb; \ + ret = (dir)->i_op->op(__VA_ARGS__, &dret); \ + if (ret == -EINPROGRESS) { \ + wait_var_event(&dret, \ + dret.err != -EINPROGRESS); \ + ret = dret.err; \ + } \ + ret; \ + }) + +#define DO_DE_DIROP(dir, op, ...) 
\ + ({ \ + struct dirop_ret dret; \ + struct dentry *ret; \ + dret.dentry = ERR_PTR(-EINPROGRESS); \ + dret.done_cb = dirop_done_cb; \ + ret = (dir)->i_op->op(__VA_ARGS__, &dret); \ + if (ret == ERR_PTR(-EINPROGRESS)) { \ + wait_var_event(&dret, \ + dret.dentry != ERR_PTR(-EINPROGRESS)); \ + ret = dret.dentry; \ + } \ + ret; \ + }) + #define EMBEDDED_NAME_MAX (PATH_MAX - offsetof(struct filename, iname)) struct filename * @@ -3403,14 +3438,17 @@ int vfs_create(struct mnt_idmap *idmap, struct inode *dir, if (error) return error; - if (!dir->i_op->create) + if (!dir->i_op->create && !dir->i_op->create_async) return -EACCES; /* shouldn't it be ENOSYS? */ mode = vfs_prepare_mode(idmap, dir, mode, S_IALLUGO, S_IFREG); error = security_inode_create(dir, dentry, mode); if (error) return error; - error = dir->i_op->create(idmap, dir, dentry, mode, want_excl); + if (dir->i_op->create_async) + error = DO_DIROP(dir, create_async, idmap, dir, dentry, mode, want_excl); + else + error = dir->i_op->create(idmap, dir, dentry, mode, want_excl); if (!error) fsnotify_create(dir, dentry); return error; @@ -3571,8 +3609,12 @@ static struct dentry *atomic_open(struct nameidata *nd, struct dentry *dentry, file->f_path.dentry = DENTRY_NOT_SET; file->f_path.mnt = nd->path.mnt; - error = dir->i_op->atomic_open(dir, dentry, file, - open_to_namei_flags(open_flag), mode); + if (dir->i_op->atomic_open_async) + error = DO_DIROP(dir, atomic_open_async, dir, dentry, file, + open_to_namei_flags(open_flag), mode); + else + error = dir->i_op->atomic_open(dir, dentry, file, + open_to_namei_flags(open_flag), mode); d_lookup_done(dentry); if (!error) { if (file->f_mode & FMODE_OPENED) { @@ -3680,7 +3722,7 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file, } if (create_error) open_flag &= ~O_CREAT; - if (dir_inode->i_op->atomic_open) { + if (dir_inode->i_op->atomic_open || dir_inode->i_op->atomic_open_async) { dentry = atomic_open(nd, dentry, file, open_flag, mode); if 
(unlikely(create_error) && dentry == ERR_PTR(-ENOENT)) dentry = ERR_PTR(create_error); @@ -3705,13 +3747,16 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file, if (!dentry->d_inode && (open_flag & O_CREAT)) { file->f_mode |= FMODE_CREATED; audit_inode_child(dir_inode, dentry, AUDIT_TYPE_CHILD_CREATE); - if (!dir_inode->i_op->create) { - error = -EACCES; - goto out_dput; - } - error = dir_inode->i_op->create(idmap, dir_inode, dentry, - mode, open_flag & O_EXCL); + if (dir_inode->i_op->create_async) + error = DO_DIROP(dir_inode, create_async, idmap, dir_inode, + dentry, mode, open_flag & O_EXCL); + else if (dir_inode->i_op->create) + error = dir_inode->i_op->create(idmap, dir_inode, + dentry, mode, + open_flag & O_EXCL); + else + error = -EACCES; if (error) goto out_dput; } @@ -4217,7 +4262,7 @@ int vfs_mknod(struct mnt_idmap *idmap, struct inode *dir, !capable(CAP_MKNOD)) return -EPERM; - if (!dir->i_op->mknod) + if (!dir->i_op->mknod && !dir->i_op->mknod_async) return -EPERM; mode = vfs_prepare_mode(idmap, dir, mode, mode, mode); @@ -4229,7 +4274,10 @@ int vfs_mknod(struct mnt_idmap *idmap, struct inode *dir, if (error) return error; - error = dir->i_op->mknod(idmap, dir, dentry, mode, dev); + if (dir->i_op->mknod_async) + error = DO_DIROP(dir, mknod_async, idmap, dir, dentry, mode, dev); + else + error = dir->i_op->mknod(idmap, dir, dentry, mode, dev); if (!error) fsnotify_create(dir, dentry); return error; @@ -4340,7 +4388,7 @@ int vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, if (error) return error; - if (!dir->i_op->mkdir) + if (!dir->i_op->mkdir && !dir->i_op->mkdir_async) return -EPERM; mode = vfs_prepare_mode(idmap, dir, mode, S_IRWXUGO | S_ISVTX, 0); @@ -4351,7 +4399,16 @@ int vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, if (max_links && dir->i_nlink >= max_links) return -EMLINK; - error = dir->i_op->mkdir(idmap, dir, dentry, mode); + if (dir->i_op->mkdir_async) { + struct dentry *de; + de = DO_DE_DIROP(dir, 
mkdir_async, idmap, dir, dentry, mode); + if (IS_ERR(de)) + error = PTR_ERR(de); + else if (de) + dput(de); + } else { + error = dir->i_op->mkdir(idmap, dir, dentry, mode); + } if (!error) fsnotify_mkdir(dir, dentry); return error; @@ -4399,6 +4456,20 @@ int vfs_mkdir_return(struct mnt_idmap *idmap, struct inode *dir, if (max_links && dir->i_nlink >= max_links) return -EMLINK; + if (dir->i_op->mkdir_async) { + struct dentry *de; + + de = DO_DE_DIROP(dir, mkdir_async, idmap, dir, dentry, mode); + if (IS_ERR(de)) + return PTR_ERR(de); + if (de) { + dput(dentry); + *dentryp = de; + } + fsnotify_mkdir(dir, dentry); + return 0; + } + error = dir->i_op->mkdir(idmap, dir, dentry, mode); if (!error) { fsnotify_mkdir(dir, dentry); @@ -4488,7 +4559,7 @@ int vfs_rmdir(struct mnt_idmap *idmap, struct inode *dir, if (error) return error; - if (!dir->i_op->rmdir) + if (!dir->i_op->rmdir && !dir->i_op->rmdir_async) return -EPERM; dget(dentry); @@ -4503,7 +4574,10 @@ int vfs_rmdir(struct mnt_idmap *idmap, struct inode *dir, if (error) goto out; - error = dir->i_op->rmdir(dir, dentry); + if (dir->i_op->rmdir_async) + error = DO_DIROP(dir, rmdir_async, dir, dentry); + else + error = dir->i_op->rmdir(dir, dentry); if (error) goto out; @@ -4613,7 +4687,7 @@ int vfs_unlink(struct mnt_idmap *idmap, struct inode *dir, if (error) return error; - if (!dir->i_op->unlink) + if (!dir->i_op->unlink && !dir->i_op->unlink_async) return -EPERM; inode_lock(target); @@ -4627,7 +4701,10 @@ int vfs_unlink(struct mnt_idmap *idmap, struct inode *dir, error = try_break_deleg(target, delegated_inode); if (error) goto out; - error = dir->i_op->unlink(dir, dentry); + if (dir->i_op->unlink_async) + error = DO_DIROP(dir, unlink_async, dir, dentry); + else + error = dir->i_op->unlink(dir, dentry); if (!error) { dont_mount(dentry); detach_mounts(dentry); @@ -4761,14 +4838,17 @@ int vfs_symlink(struct mnt_idmap *idmap, struct inode *dir, if (error) return error; - if (!dir->i_op->symlink) + if 
(!dir->i_op->symlink && !dir->i_op->symlink_async) return -EPERM; error = security_inode_symlink(dir, dentry, oldname); if (error) return error; - error = dir->i_op->symlink(idmap, dir, dentry, oldname); + if (dir->i_op->symlink_async) + error = DO_DIROP(dir, symlink_async, idmap, dir, dentry, oldname); + else + error = dir->i_op->symlink(idmap, dir, dentry, oldname); if (!error) fsnotify_create(dir, dentry); return error; @@ -4874,7 +4954,7 @@ int vfs_link(struct dentry *old_dentry, struct mnt_idmap *idmap, */ if (HAS_UNMAPPED_ID(idmap, inode)) return -EPERM; - if (!dir->i_op->link) + if (!dir->i_op->link && !dir->i_op->link_async) return -EPERM; if (S_ISDIR(inode->i_mode)) return -EPERM; @@ -4891,7 +4971,11 @@ int vfs_link(struct dentry *old_dentry, struct mnt_idmap *idmap, error = -EMLINK; else { error = try_break_deleg(inode, delegated_inode); - if (!error) + if (error) + ; + else if (dir->i_op->link_async) + error = DO_DIROP(dir, link_async, old_dentry, dir, new_dentry); + else error = dir->i_op->link(old_dentry, dir, new_dentry); } @@ -5083,7 +5167,7 @@ int vfs_rename(struct renamedata *rd) if (error) return error; - if (!old_dir->i_op->rename) + if (!old_dir->i_op->rename && !old_dir->i_op->rename_async) return -EPERM; /* @@ -5166,8 +5250,14 @@ int vfs_rename(struct renamedata *rd) if (error) goto out; } - error = old_dir->i_op->rename(rd->new_mnt_idmap, old_dir, old_dentry, - new_dir, new_dentry, flags); + if (old_dir->i_op->rename_async) + error = DO_DIROP(old_dir, rename_async, rd->new_mnt_idmap, + old_dir, old_dentry, + new_dir, new_dentry, flags); + else + error = old_dir->i_op->rename(rd->new_mnt_idmap, + old_dir, old_dentry, + new_dir, new_dentry, flags); if (error) goto out; diff --git a/include/linux/fs.h b/include/linux/fs.h index f81d6bc65fe4..e414400c2487 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2187,6 +2187,14 @@ int wrap_directory_iterator(struct file *, struct dir_context *, static int shared_##x(struct file *file , struct 
dir_context *ctx) \ { return wrap_directory_iterator(file, ctx, x); } +struct dirop_ret { + union { + int err; + struct dentry *dentry; + }; + void (*done_cb)(struct dirop_ret*); +}; + struct inode_operations { struct dentry * (*lookup) (struct inode *,struct dentry *, unsigned int); const char * (*get_link) (struct dentry *, struct inode *, struct delayed_call *); @@ -2197,17 +2205,30 @@ struct inode_operations { int (*create) (struct mnt_idmap *, struct inode *,struct dentry *, umode_t, bool); + int (*create_async) (struct mnt_idmap *, struct inode *,struct dentry *, + umode_t, bool, struct dirop_ret *); int (*link) (struct dentry *,struct inode *,struct dentry *); + int (*link_async) (struct dentry *,struct inode *,struct dentry *, struct dirop_ret *); int (*unlink) (struct inode *,struct dentry *); + int (*unlink_async) (struct inode *,struct dentry *, struct dirop_ret *); int (*symlink) (struct mnt_idmap *, struct inode *,struct dentry *, const char *); + int (*symlink_async) (struct mnt_idmap *, struct inode *,struct dentry *, + const char *, struct dirop_ret *); int (*mkdir) (struct mnt_idmap *, struct inode *,struct dentry *, umode_t); + struct dentry * (*mkdir_async) (struct mnt_idmap *, struct inode *,struct dentry *, + umode_t, struct dirop_ret *); int (*rmdir) (struct inode *,struct dentry *); + int (*rmdir_async) (struct inode *,struct dentry *, struct dirop_ret *); int (*mknod) (struct mnt_idmap *, struct inode *,struct dentry *, umode_t,dev_t); + int (*mknod_async) (struct mnt_idmap *, struct inode *,struct dentry *, + umode_t,dev_t, struct dirop_ret *); int (*rename) (struct mnt_idmap *, struct inode *, struct dentry *, struct inode *, struct dentry *, unsigned int); + int (*rename_async) (struct mnt_idmap *, struct inode *, struct dentry *, + struct inode *, struct dentry *, unsigned int, struct dirop_ret *); int (*setattr) (struct mnt_idmap *, struct dentry *, struct iattr *); int (*getattr) (struct mnt_idmap *, const struct path *, struct kstat 
*, u32, unsigned int); @@ -2218,6 +2239,9 @@ struct inode_operations { int (*atomic_open)(struct inode *, struct dentry *, struct file *, unsigned open_flag, umode_t create_mode); + int (*atomic_open_async)(struct inode *, struct dentry *, + struct file *, unsigned open_flag, + umode_t create_mode, struct dirop_ret *); int (*tmpfile) (struct mnt_idmap *, struct inode *, struct file *, umode_t); struct posix_acl *(*get_acl)(struct mnt_idmap *, struct dentry *, From patchwork Thu Feb 6 05:42:47 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 13962250 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.223.130]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 096B9224B0E; Thu, 6 Feb 2025 05:47:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.130 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738820832; cv=none; b=UmJAr3UoCaPT+q/w6yO41w3HKhcXRyPx0sk8meQEjUon2m2aUcIJHLcM3ZYY/zMyIwhuVXBikajH14r2Gej90P+LMr89+qGxnziqC9EEfL0ghqz4aSTU7RoJYAyAwSDSxO8XtlUzbOWUQzBWZaeKLN0TvI7Zgj8zSHv/fmxaCCg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738820832; c=relaxed/simple; bh=hpiSBFLLA3zJIzrDKR6EC2urgIUr9uaLWZtt1neW9wM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=qSsfPrV4iBeQawAd5zxQCfiekTfbQ6rRGI6o7A1jstCxi3euSrP0ljiVlt81B0LUikIV0wbDeodXmg1SP9EfMBpzevBCwXc9YNmBX1+bxgjmus4HcgaI9Ve+Woo2PzLSN6bAT2hF99yXoTFQSzE7LMQZWh4qh7Uc1Dqj6Dz/Q7c= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de; spf=pass smtp.mailfrom=suse.de; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=zvwTqS/x; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de 
header.b=PD87s1nV; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=zvwTqS/x; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=PD87s1nV; arc=none smtp.client-ip=195.135.223.130 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="zvwTqS/x"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="PD87s1nV"; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="zvwTqS/x"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="PD87s1nV" Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 62AA621108; Thu, 6 Feb 2025 05:47:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1738820829; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sMpYvVJMWnlLZwdGjgJZ84ZYkT2VEezdAmlsywYkUn0=; b=zvwTqS/xttadY/b8wd01NIIG6xiReBxgTfDj85qjfDM2zb5ycn2/6tqJr7rN5Hi7SNWatn FAKFLSXsu9XD3sdsIOhK7xRIG5du6dHgS9m7ra+9jxTs4AmtN2PKZMbzhGvoPq4L9DiF+P 5/MA06Np36cSOCHIYYRLWOQaJmTYjQ0= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1738820829; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sMpYvVJMWnlLZwdGjgJZ84ZYkT2VEezdAmlsywYkUn0=; 
b=PD87s1nVWoykQxw/voNOvfrMklaIi0eGnQnOrnFVvzAwfogj6YtufGtnwUiiwr59ECB7p0 7dcaSDXT2wq32wAA== Authentication-Results: smtp-out1.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1738820829; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sMpYvVJMWnlLZwdGjgJZ84ZYkT2VEezdAmlsywYkUn0=; b=zvwTqS/xttadY/b8wd01NIIG6xiReBxgTfDj85qjfDM2zb5ycn2/6tqJr7rN5Hi7SNWatn FAKFLSXsu9XD3sdsIOhK7xRIG5du6dHgS9m7ra+9jxTs4AmtN2PKZMbzhGvoPq4L9DiF+P 5/MA06Np36cSOCHIYYRLWOQaJmTYjQ0= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1738820829; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sMpYvVJMWnlLZwdGjgJZ84ZYkT2VEezdAmlsywYkUn0=; b=PD87s1nVWoykQxw/voNOvfrMklaIi0eGnQnOrnFVvzAwfogj6YtufGtnwUiiwr59ECB7p0 7dcaSDXT2wq32wAA== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 9E0A913795; Thu, 6 Feb 2025 05:47:06 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id qHWXFNpMpGe5BwAAD6G6ig (envelope-from ); Thu, 06 Feb 2025 05:47:06 +0000 From: NeilBrown To: Alexander Viro , Christian Brauner , Jan Kara , Linus Torvalds , Jeff Layton , Dave Chinner Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 10/19] VFS: introduce inode flags to report locking needs for directory ops Date: Thu, 6 Feb 2025 16:42:47 +1100 Message-ID: <20250206054504.2950516-11-neilb@suse.de> X-Mailer: 
git-send-email 2.47.1 In-Reply-To: <20250206054504.2950516-1-neilb@suse.de> References: <20250206054504.2950516-1-neilb@suse.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Score: -2.80 X-Spamd-Result: default: False [-2.80 / 50.00]; BAYES_HAM(-3.00)[100.00%]; MID_CONTAINS_FROM(1.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_MISSING_CHARSET(0.50)[]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; RCVD_VIA_SMTP_AUTH(0.00)[]; RCPT_COUNT_SEVEN(0.00)[8]; TO_DN_SOME(0.00)[]; MIME_TRACE(0.00)[0:+]; ARC_NA(0.00)[]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; FUZZY_BLOCKED(0.00)[rspamd.com]; FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[]; R_RATELIMIT(0.00)[from(RLewrxuus8mos16izbn)]; RCVD_COUNT_TWO(0.00)[2]; TO_MATCH_ENVRCPT_ALL(0.00)[]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:mid,suse.de:email]; RCVD_TLS_ALL(0.00)[] X-Spam-Flag: NO X-Spam-Level:

If a filesystem supports _async ops for some directory ops we can take
a "shared" lock on i_rwsem, otherwise we must take an "exclusive" lock.
As the filesystem may support some async ops but not others we need to
easily determine which.

With this patch we group the ops into 4 groups that are likely to be
supported together:

 CREATE: create, link, symlink, mkdir, mknod
 REMOVE: rmdir, unlink
 RENAME: rename
 OPEN: atomic_open, create

and set S_ASYNC_XXX for each when the inode is initialised.

We also add a LOOKUP_REMOVE intent flag which will be used by locking
interfaces to help know which group is being used.
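The grouping described above can be sketched in userspace as follows. This is only an illustrative model, not the kernel code: struct model_iops, async_ok(), model_async_flags(), model_can_lock_shared() and model_stub() are hypothetical names, and the sketch assumes the REMOVE group checks rmdir and unlink as the description states. A group flag is set only when every op in the group is either provided in _async form or not provided at all, so a caller could test a single bit to choose between a shared and an exclusive lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bit positions copied from the patch; everything else here is a model. */
#define S_ASYNC_CREATE (1u << 18)
#define S_ASYNC_REMOVE (1u << 19)
#define S_ASYNC_RENAME (1u << 20)
#define S_ASYNC_OPEN   (1u << 21)

typedef int (*op_fn)(void);

/* Hypothetical stand-in for struct inode_operations: only presence matters. */
struct model_iops {
	op_fn create, create_async;
	op_fn link, link_async;
	op_fn symlink, symlink_async;
	op_fn mkdir, mkdir_async;
	op_fn mknod, mknod_async;
	op_fn rmdir, rmdir_async;
	op_fn unlink, unlink_async;
	op_fn rename, rename_async;
	op_fn atomic_open, atomic_open_async;
};

/* An op does not force exclusive locking if it is async or not offered. */
static bool async_ok(op_fn sync_op, op_fn async_op)
{
	return async_op != NULL || sync_op == NULL;
}

/* Compute the per-group flags from the set of implemented ops. */
static unsigned int model_async_flags(const struct model_iops *op)
{
	unsigned int flags = 0;

	if (async_ok(op->create, op->create_async) &&
	    async_ok(op->link, op->link_async) &&
	    async_ok(op->symlink, op->symlink_async) &&
	    async_ok(op->mkdir, op->mkdir_async) &&
	    async_ok(op->mknod, op->mknod_async))
		flags |= S_ASYNC_CREATE;
	if (async_ok(op->unlink, op->unlink_async) &&
	    async_ok(op->rmdir, op->rmdir_async))
		flags |= S_ASYNC_REMOVE;
	if (op->rename_async)
		flags |= S_ASYNC_RENAME;
	if (op->atomic_open_async ||
	    (!op->atomic_open && op->create_async))
		flags |= S_ASYNC_OPEN;
	return flags;
}

/* Hypothetical caller: a shared lock suffices only if the group is async. */
static bool model_can_lock_shared(unsigned int i_flags, unsigned int group)
{
	return (i_flags & group) != 0;
}

/* Placeholder op used to mark a slot as "implemented". */
static int model_stub(void)
{
	return 0;
}
```

With such a model, a filesystem that offers create_async but only a synchronous unlink would get the CREATE and OPEN bits but not REMOVE, so removals would still need the exclusive lock.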
Signed-off-by: NeilBrown --- fs/dcache.c | 24 ++++++++++++++++++++++++ include/linux/fs.h | 5 +++++ include/linux/namei.h | 5 +++-- 3 files changed, 32 insertions(+), 2 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c index e49607d00d2d..37c0f655166d 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -384,6 +384,27 @@ static inline void __d_set_inode_and_type(struct dentry *dentry, smp_store_release(&dentry->d_flags, flags); } +static void set_inode_flags(struct inode *inode) +{ + const struct inode_operations *i_op = inode->i_op; + + lockdep_assert_held(&inode->i_lock); + if ((i_op->create_async || !i_op->create) && + (i_op->link_async || !i_op->link) && + (i_op->symlink_async || !i_op->symlink) && + (i_op->mkdir_async || !i_op->mkdir) && + (i_op->mknod_async || !i_op->mknod)) + inode->i_flags |= S_ASYNC_CREATE; + if ((i_op->unlink_async || !i_op->unlink) && + (i_op->rmdir_async || !i_op->rmdir)) + inode->i_flags |= S_ASYNC_REMOVE; + if (i_op->rename_async) + inode->i_flags |= S_ASYNC_RENAME; + if (i_op->atomic_open_async || + (!i_op->atomic_open && i_op->create_async)) + inode->i_flags |= S_ASYNC_OPEN; +} + static inline void __d_clear_type_and_inode(struct dentry *dentry) { unsigned flags = READ_ONCE(dentry->d_flags); @@ -1893,6 +1914,7 @@ static void __d_instantiate(struct dentry *dentry, struct inode *inode) raw_write_seqcount_begin(&dentry->d_seq); __d_set_inode_and_type(dentry, inode, add_flags); raw_write_seqcount_end(&dentry->d_seq); + set_inode_flags(inode); fsnotify_update_flags(dentry); spin_unlock(&dentry->d_lock); } @@ -1999,6 +2021,7 @@ static struct dentry *__d_obtain_alias(struct inode *inode, bool disconnected) spin_lock(&new->d_lock); __d_set_inode_and_type(new, inode, add_flags); + set_inode_flags(inode); hlist_add_head(&new->d_u.d_alias, &inode->i_dentry); if (!disconnected) { hlist_bl_lock(&sb->s_roots); @@ -2701,6 +2724,7 @@ static inline void __d_add(struct dentry *dentry, struct inode *inode) raw_write_seqcount_begin(&dentry->d_seq);
__d_set_inode_and_type(dentry, inode, add_flags); raw_write_seqcount_end(&dentry->d_seq); + set_inode_flags(inode); fsnotify_update_flags(dentry); } __d_rehash(dentry); diff --git a/include/linux/fs.h b/include/linux/fs.h index e414400c2487..9a9282fef347 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2361,6 +2361,11 @@ struct super_operations { #define S_VERITY (1 << 16) /* Verity file (using fs/verity/) */ #define S_KERNEL_FILE (1 << 17) /* File is in use by the kernel (eg. fs/cachefiles) */ +#define S_ASYNC_CREATE BIT(18) /* create, link, symlink, mkdir, mknod all _async */ +#define S_ASYNC_REMOVE BIT(19) /* unlink, rmdir both _async */ +#define S_ASYNC_RENAME BIT(20) /* rename_async supported */ +#define S_ASYNC_OPEN BIT(21) /* atomic_open_async or create_async supported */ + /* * Note that nosuid etc flags are inode-specific: setting some file-system * flags just means all the inodes inherit those flags by default. It might be diff --git a/include/linux/namei.h b/include/linux/namei.h index 76c587a5ec3a..72e351640406 100644 --- a/include/linux/namei.h +++ b/include/linux/namei.h @@ -40,10 +40,11 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT}; #define LOOKUP_CREATE BIT(17) /* ... in object creation */ #define LOOKUP_EXCL BIT(18) /* ... in target must not exist */ #define LOOKUP_RENAME_TARGET BIT(19) /* ... in destination of rename() */ +#define LOOKUP_REMOVE BIT(20) /* ... in target of object removal */ #define LOOKUP_INTENT_FLAGS (LOOKUP_OPEN | LOOKUP_CREATE | LOOKUP_EXCL | \ - LOOKUP_RENAME_TARGET) -/* 4 spare bits for intent */ + LOOKUP_RENAME_TARGET | LOOKUP_REMOVE) +/* 3 spare bits for intent */ /* Scoping flags for lookup. */ #define LOOKUP_NO_SYMLINKS BIT(24) /* No symlink crossing.
*/ From patchwork Thu Feb 6 05:42:48 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 13962251 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 61A44224893; Thu, 6 Feb 2025 05:47:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738820839; cv=none; b=Or8GTeGoeFX7BV92nZNNGNwoNc4xfSpveak1u1ccexyyxu9ncS2xB2nFzSNicmhu2g417Ek3iMLNy5NoqKuc89bvXjwsDkPjb/xTk4bZ5Wvnoa20TPeDZ6zK9U1cL/6t4mucPhoR6k5Wa0Ngyyai5PQMnFp/XUy1UpXWyrHnDNs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738820839; c=relaxed/simple; bh=rCm6Se0pWsL4OLFAYwONdyuTr+zXx1uu6RKce+dNOIc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=f4/MkmnSR0UG/5aRnCVVis8N6DNYFngLN6NBuAWFT4YVKX0VEY6NuN+rU1WxLqRvQHf7r0bE7qErZ54JgJbRwkzWXNIpRXW5gZcmFbQhIhN9kSvN4MQZijqIVh6n04mq4ac6f8HRrGnJxVL7IudbhqrDWNwS+KCFC3hqPOD3d/A= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de; spf=pass smtp.mailfrom=suse.de; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=EKUCj0NW; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=FFqQ//Cf; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=EKUCj0NW; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=FFqQ//Cf; arc=none smtp.client-ip=195.135.223.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) 
header.d=suse.de header.i=@suse.de header.b="EKUCj0NW"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="FFqQ//Cf"; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="EKUCj0NW"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="FFqQ//Cf" Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 9BC1E1F381; Thu, 6 Feb 2025 05:47:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1738820835; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=o1Nro+4aQorWavuPaXsykpJOlUA1+Wf7rHfEGVcjK+Y=; b=EKUCj0NWGFoWHFlQ8oZCRY/nmyJiojLAZNQ0zUJm5jEY/uj78+t0ls/fjk0yfJ6qjsaVVx ciGAn7xLPZdv5O2HnfpWdYkRyZEls63nC0jZmte1Qp67eOlVB4A5dtqzCAOT3wLLhH3wJl Gc9Jmv+k8enre1F66fqngOyzru5WPWg= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1738820835; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=o1Nro+4aQorWavuPaXsykpJOlUA1+Wf7rHfEGVcjK+Y=; b=FFqQ//CfKo+JOB8ceqWB2JnydBUQrCCSvu/mWmE0Cqk95aKy0o5UMwJjO/+lvrO3cNx6C0 WItQohZpgmSHTiCg== Authentication-Results: smtp-out2.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1738820835; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=o1Nro+4aQorWavuPaXsykpJOlUA1+Wf7rHfEGVcjK+Y=; 
b=EKUCj0NWGFoWHFlQ8oZCRY/nmyJiojLAZNQ0zUJm5jEY/uj78+t0ls/fjk0yfJ6qjsaVVx ciGAn7xLPZdv5O2HnfpWdYkRyZEls63nC0jZmte1Qp67eOlVB4A5dtqzCAOT3wLLhH3wJl Gc9Jmv+k8enre1F66fqngOyzru5WPWg= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1738820835; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=o1Nro+4aQorWavuPaXsykpJOlUA1+Wf7rHfEGVcjK+Y=; b=FFqQ//CfKo+JOB8ceqWB2JnydBUQrCCSvu/mWmE0Cqk95aKy0o5UMwJjO/+lvrO3cNx6C0 WItQohZpgmSHTiCg== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id D862313795; Thu, 6 Feb 2025 05:47:12 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id stXNIuBMpGfDBwAAD6G6ig (envelope-from ); Thu, 06 Feb 2025 05:47:12 +0000 From: NeilBrown To: Alexander Viro , Christian Brauner , Jan Kara , Linus Torvalds , Jeff Layton , Dave Chinner Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 11/19] VFS: Add ability to exclusively lock a dentry and use for create/remove operations. 
Date: Thu, 6 Feb 2025 16:42:48 +1100 Message-ID: <20250206054504.2950516-12-neilb@suse.de> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250206054504.2950516-1-neilb@suse.de> References: <20250206054504.2950516-1-neilb@suse.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Score: -2.80 X-Spamd-Result: default: False [-2.80 / 50.00]; BAYES_HAM(-3.00)[100.00%]; MID_CONTAINS_FROM(1.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_MISSING_CHARSET(0.50)[]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; RCVD_VIA_SMTP_AUTH(0.00)[]; RCPT_COUNT_SEVEN(0.00)[8]; TO_DN_SOME(0.00)[]; MIME_TRACE(0.00)[0:+]; ARC_NA(0.00)[]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; FUZZY_BLOCKED(0.00)[rspamd.com]; FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[]; R_RATELIMIT(0.00)[from(RLewrxuus8mos16izbn)]; RCVD_COUNT_TWO(0.00)[2]; TO_MATCH_ENVRCPT_ALL(0.00)[]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:mid,suse.de:email]; RCVD_TLS_ALL(0.00)[] X-Spam-Flag: NO X-Spam-Level:

d_update_lock(), d_update_trylock() and d_update_unlock() are added.
They can be used to get an exclusive lock on a dentry in preparation
for updating it.

As contention on a name is rare this is optimised for the uncontended
case. A bit is set under the d_lock spinlock to claim the lock, and
wait_var_event_spinlock() is used when waiting is needed. To avoid
sending a wakeup when not needed we have a second bit flag to indicate
if there are any waiters.

This locking is used in lookup_and_lock(). Once the exclusive "update"
lock is obtained on the dentry we must make sure it wasn't unlinked or
renamed while we slept. If it was we repeat the lookup.

We also ensure that the parent isn't similarly locked. This will be
used to protect a directory during rmdir.
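The two-bit scheme described above (one bit for the lock, one bit recording that somebody is waiting) can be modelled in userspace roughly as follows. This is a sketch under stated assumptions, not the kernel implementation: a pthread mutex and condition variable stand in for d_lock and wait_var_event_spinlock(), the names struct model_dentry, PAR_UPDATE, PAR_WAITER and the model_*() helpers are hypothetical, and the re-check for unlink/rename and the parent check are deliberately left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define PAR_UPDATE 0x1u	/* the "update" lock is held */
#define PAR_WAITER 0x2u	/* at least one task is sleeping on the lock */

/* Hypothetical stand-in for the relevant part of struct dentry. */
struct model_dentry {
	pthread_mutex_t d_lock;	/* models the d_lock spinlock */
	pthread_cond_t wq;	/* models the wait_var_event() wait queue */
	unsigned int d_flags;
};

static void model_init(struct model_dentry *d)
{
	pthread_mutex_init(&d->d_lock, NULL);
	pthread_cond_init(&d->wq, NULL);
	d->d_flags = 0;
}

/* Uncontended case: take d_lock, set one bit, drop d_lock. */
static void model_update_lock(struct model_dentry *d)
{
	pthread_mutex_lock(&d->d_lock);
	while (d->d_flags & PAR_UPDATE) {
		/* Record that the holder must send a wakeup on unlock. */
		d->d_flags |= PAR_WAITER;
		pthread_cond_wait(&d->wq, &d->d_lock);
	}
	d->d_flags |= PAR_UPDATE;
	pthread_mutex_unlock(&d->d_lock);
}

/* Never sleeps; reports whether the lock was obtained. */
static bool model_update_trylock(struct model_dentry *d)
{
	bool ok = false;

	pthread_mutex_lock(&d->d_lock);
	if (!(d->d_flags & PAR_UPDATE)) {
		d->d_flags |= PAR_UPDATE;
		ok = true;
	}
	pthread_mutex_unlock(&d->d_lock);
	return ok;
}

/* Returns true if a wakeup had to be sent (only for illustration). */
static bool model_update_unlock(struct model_dentry *d)
{
	bool woke;

	pthread_mutex_lock(&d->d_lock);
	woke = (d->d_flags & PAR_WAITER) != 0;
	if (woke)
		pthread_cond_broadcast(&d->wq);
	d->d_flags &= ~(PAR_UPDATE | PAR_WAITER);
	pthread_mutex_unlock(&d->d_lock);
	return woke;
}
```

The point of the second bit is visible in model_update_unlock(): when nobody set PAR_WAITER, the unlock path skips the wakeup entirely, which keeps the common uncontended case cheap.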
Signed-off-by: NeilBrown --- fs/dcache.c | 5 +- fs/internal.h | 18 +++++++ fs/namei.c | 110 ++++++++++++++++++++++++++++++++++++++++- include/linux/dcache.h | 4 ++ 4 files changed, 134 insertions(+), 3 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c index 37c0f655166d..e705696ca57e 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -1675,9 +1675,10 @@ EXPORT_SYMBOL(d_invalidate); * available. On a success the dentry is returned. The name passed in is * copied and the copy passed in may be reused after this call. */ - + static struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name) { + static struct lock_class_key __key; struct dentry *dentry; char *dname; int err; @@ -1735,6 +1736,8 @@ static struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name) INIT_HLIST_NODE(&dentry->d_sib); d_set_d_op(dentry, dentry->d_sb->s_d_op); + lockdep_init_map(&dentry->d_update_map, "DCACHE_PAR_UPDATE", &__key, 0); + if (dentry->d_op && dentry->d_op->d_init) { err = dentry->d_op->d_init(dentry); if (err) { diff --git a/fs/internal.h b/fs/internal.h index e7f02ae1e098..5cb9a34e26e8 100644 --- a/fs/internal.h +++ b/fs/internal.h @@ -225,6 +225,24 @@ extern struct dentry *__d_lookup_rcu(const struct dentry *parent, const struct qstr *name, unsigned *seq); extern void d_genocide(struct dentry *); +extern bool d_update_lock(struct dentry *dentry, + struct dentry *base, const struct qstr *last, + unsigned int subclass); + +extern bool d_update_trylock(struct dentry *dentry, + struct dentry *base, + const struct qstr *last); + +static inline void d_update_unlock(struct dentry *dentry) +{ + lock_map_release(&dentry->d_update_map); + spin_lock(&dentry->d_lock); + if (dentry->d_flags & DCACHE_PAR_WAITER) + wake_up_var_locked(&dentry->d_flags, &dentry->d_lock); + dentry->d_flags &= ~(DCACHE_PAR_UPDATE | DCACHE_PAR_WAITER); + spin_unlock(&dentry->d_lock); +} + /* * pipe.c */ diff --git a/fs/namei.c b/fs/namei.c index eadde9de73bf..145ae07f9b8c 100644 --- 
a/fs/namei.c +++ b/fs/namei.c @@ -1750,6 +1750,110 @@ struct dentry *lookup_one_qstr(const struct qstr *name, } EXPORT_SYMBOL(lookup_one_qstr); +/* + * dentry locking for updates. + * When modifying a directory the target dentry will be locked by + * setting DCACHE_PAR_UPDATE under ->d_lock. If it is already set, + * DCACHE_PAR_WAITER is set to ensure a wakeup is sent, and we wait + * using wait_var_event_spinlock(). + * The DCACHE_PAR_UPDATE bit will only be set in a dentry if it is + * NOT set in the parent. This avoids commencing a new operation in + * a directory that is being asynchronously deleted using ->rmdir_async. + * Instead of holding ->d_lock on the parent while testing the flag, we + * use memory ordering to ensure correctness. Locking a child + * retests the parent *after* setting the bit, and deleting a directory + * requires testing all children *after* setting the bit in the parent. + */ + +static bool check_dentry_locked(struct dentry *de) +{ + if (de->d_flags & DCACHE_PAR_UPDATE) { + de->d_flags |= DCACHE_PAR_WAITER; + return true; + } + return false; +} + +bool d_update_lock(struct dentry *dentry, + struct dentry *base, const struct qstr *last, + unsigned int subclass) +{ + lock_acquire_exclusive(&dentry->d_update_map, subclass, 0, NULL, _THIS_IP_); +again: + spin_lock(&dentry->d_lock); + wait_var_event_spinlock(&dentry->d_flags, + !check_dentry_locked(dentry), + &dentry->d_lock); + if (d_is_positive(dentry)) { + rcu_read_lock(); /* needed for d_same_name() */ + if ( + /* Was unlinked while we waited ?*/ + d_unhashed(dentry) || + /* Or was dentry renamed ??
*/ + dentry->d_parent != base || + dentry->d_name.hash != last->hash || + !d_same_name(dentry, base, last) + ) { + rcu_read_unlock(); + spin_unlock(&dentry->d_lock); + lock_map_release(&dentry->d_update_map); + return false; + } + rcu_read_unlock(); + } + /* Must ensure DCACHE_PAR_UPDATE in child is visible before reading + * from parent + */ + smp_store_mb(dentry->d_flags, dentry->d_flags | DCACHE_PAR_UPDATE); + if (base->d_flags & DCACHE_PAR_UPDATE) { + /* We cannot grant DCACHE_PAR_UPDATE on a dentry while + * it is held on the parent + */ + dentry->d_flags &= ~DCACHE_PAR_UPDATE; + spin_unlock(&dentry->d_lock); + spin_lock(&base->d_lock); + wait_var_event_spinlock(&base->d_flags, + !check_dentry_locked(base), + &base->d_lock); + spin_unlock(&base->d_lock); + goto again; + } + spin_unlock(&dentry->d_lock); + return true; +} + +bool d_update_trylock(struct dentry *dentry, + struct dentry *base, + const struct qstr *last) +{ + int ret = false; + + spin_lock(&dentry->d_lock); + rcu_read_lock(); /* needed for d_same_name() */ + if (!(smp_load_acquire(&dentry->d_flags) & DCACHE_PAR_UPDATE) && + !(dentry->d_parent->d_flags & DCACHE_PAR_UPDATE)) { + if (!base || !( + /* Was unlinked before we got spinlock ?*/ + d_unhashed(dentry) || + /* Or was dentry renamed ?? 
*/ + dentry->d_parent != base || + dentry->d_name.hash != last->hash || + !d_same_name(dentry, base, last) + )) { + lock_map_acquire_try(&dentry->d_update_map); + smp_store_mb(dentry->d_flags, + dentry->d_flags | DCACHE_PAR_UPDATE); + if (dentry->d_parent->d_flags & DCACHE_PAR_UPDATE) + dentry->d_flags &= ~DCACHE_PAR_UPDATE; + else + ret = true; + } + } + rcu_read_unlock(); + spin_unlock(&dentry->d_lock); + return ret; +} + static struct dentry *lookup_and_lock_nested(const struct qstr *last, struct dentry *base, unsigned int lookup_flags, @@ -1759,8 +1863,9 @@ static struct dentry *lookup_and_lock_nested(const struct qstr *last, if (!(lookup_flags & LOOKUP_PARENT_LOCKED)) inode_lock_nested(base->d_inode, subclass); - - dentry = lookup_one_qstr(last, base, lookup_flags); + do { + dentry = lookup_one_qstr(last, base, lookup_flags); + } while (!IS_ERR(dentry) && !d_update_lock(dentry, base, last, subclass)); if (IS_ERR(dentry) && !(lookup_flags & LOOKUP_PARENT_LOCKED)) { inode_unlock(base->d_inode); } @@ -1779,6 +1884,7 @@ void done_lookup_and_lock(struct dentry *base, struct dentry *dentry, unsigned int lookup_flags) { d_lookup_done(dentry); + d_update_unlock(dentry); dput(dentry); if (!(lookup_flags & LOOKUP_PARENT_LOCKED)) inode_unlock(base->d_inode); diff --git a/include/linux/dcache.h b/include/linux/dcache.h index d5816cf19538..f891fb1be63b 100644 --- a/include/linux/dcache.h +++ b/include/linux/dcache.h @@ -111,6 +111,8 @@ struct dentry { * possible! 
 					 */
+	/* lockdep tracking of DCACHE_PAR_UPDATE locks */
+	struct lockdep_map d_update_map;
 
 	union {
 		struct list_head d_lru;		/* LRU list */
 		wait_queue_head_t *d_wait;	/* in-lookup ones only */
@@ -232,6 +234,8 @@ struct dentry_operations {
 #define DCACHE_DENTRY_CURSOR		BIT(25)
 #define DCACHE_NORCU			BIT(26) /* No RCU delay for freeing */
+#define DCACHE_PAR_UPDATE		BIT(27) /* Locked for update */
+#define DCACHE_PAR_WAITER		BIT(28) /* someone is waiting for PAR_UPDATE */
 
 extern seqlock_t rename_lock;
 
 /*

From patchwork Thu Feb 6 05:42:49 2025
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 13962252
From: NeilBrown
To: Alexander Viro, Christian Brauner, Jan Kara, Linus Torvalds, Jeff Layton, Dave Chinner
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 12/19] VFS: enhance d_splice_alias to accommodate shared-lock updates
Date: Thu, 6 Feb 2025 16:42:49 +1100
Message-ID: <20250206054504.2950516-13-neilb@suse.de>
In-Reply-To: <20250206054504.2950516-1-neilb@suse.de>
References: <20250206054504.2950516-1-neilb@suse.de>

d_splice_alias() - via __d_unalias() - currently assumes that taking a
shared lock on the parent directory locks against any change to the
parent/name of the dentry. This will no longer be the case with
shared-lock updates. We also need a DCACHE_PAR_UPDATE lock on the
dentry.

This patch adds a call to d_update_trylock() to get this lock - if
possible. This lock ensures that the test on ->d_parent and ->d_name
in d_update_lock() will not be invalidated by the __d_move() in
__d_unalias().

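For readers who want to see the trylock rule in isolation, here is a
minimal userspace sketch. It is not kernel code: struct model_dentry,
model_update_trylock() and model_update_unlock() are hypothetical names
invented for illustration, C11 atomics stand in for ->d_lock plus
smp_store_mb(), and the name/parent revalidation and lockdep tracking
are left out. It only models the ordering idea: publish the bit in the
child, then re-test the parent and back off if the parent is locked.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAR_UPDATE (1u << 27)	/* mirrors DCACHE_PAR_UPDATE */

/* Hypothetical stand-in for struct dentry; parent is NULL for a root. */
struct model_dentry {
	struct model_dentry *parent;
	atomic_uint flags;
};

static bool model_parent_locked(struct model_dentry *d)
{
	return d->parent &&
	       (atomic_load(&d->parent->flags) & MODEL_PAR_UPDATE);
}

/* Try to take the update lock; never waits. */
static bool model_update_trylock(struct model_dentry *d)
{
	unsigned int old;

	if (model_parent_locked(d))
		return false;
	/* Atomic test-and-set of the bit, with full ordering. */
	old = atomic_fetch_or(&d->flags, MODEL_PAR_UPDATE);
	if (old & MODEL_PAR_UPDATE)
		return false;	/* someone else holds it; leave it set */
	/* Re-test the parent *after* our bit is visible. */
	if (model_parent_locked(d)) {
		atomic_fetch_and(&d->flags, ~MODEL_PAR_UPDATE);
		return false;
	}
	return true;
}

static void model_update_unlock(struct model_dentry *d)
{
	unsigned int old = atomic_fetch_and(&d->flags, ~MODEL_PAR_UPDATE);

	/* Unlocking a dentry that was not locked is a caller bug. */
	assert(old & MODEL_PAR_UPDATE);
	(void)old;
}
```

In this model a caller in the position of __d_unalias() would call
model_update_trylock() before moving the dentry and
model_update_unlock() afterwards, bailing out if the trylock fails.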
Signed-off-by: NeilBrown
---
 fs/dcache.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index e705696ca57e..fb331596f1b1 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -3036,13 +3036,17 @@ static int __d_unalias(struct dentry *dentry, struct dentry *alias)
 		goto out_err;
 	m2 = &alias->d_parent->d_inode->i_rwsem;
 out_unalias:
+	if (!d_update_trylock(dentry, NULL, NULL))
+		goto out_err;
 	if (alias->d_op && alias->d_op->d_unalias_trylock &&
 	    !alias->d_op->d_unalias_trylock(alias))
-		goto out_err;
+		goto out_err2;
 	__d_move(alias, dentry, false);
 	if (alias->d_op && alias->d_op->d_unalias_unlock)
 		alias->d_op->d_unalias_unlock(alias);
 	ret = 0;
+out_err2:
+	d_update_unlock(dentry);
 out_err:
 	if (m2)
 		up_read(m2);
@@ -3073,6 +3077,10 @@ static int __d_unalias(struct dentry *dentry, struct dentry *alias)
  * In that case, we know that the inode will be a regular file, and also this
  * will only occur during atomic_open. So we need to check for the dentry
  * being already hashed only in the final case.
+ *
+ * @dentry must have a valid ->d_parent and that directory must be
+ * locked (i_rwsem) either exclusively or shared. If shared then
+ * @dentry must have %DCACHE_PAR_LOOKUP or %DCACHE_PAR_UPDATE set.
  */
 struct dentry *d_splice_alias(struct inode *inode, struct dentry *dentry)
 {

From patchwork Thu Feb 6 05:42:50 2025
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 13962253
From: NeilBrown
To: Alexander Viro, Christian Brauner, Jan Kara, Linus Torvalds, Jeff Layton, Dave Chinner
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 13/19] VFS: lock dentry for ->revalidate to avoid races with rename etc
Date: Thu, 6 Feb 2025 16:42:50 +1100
Message-ID: <20250206054504.2950516-14-neilb@suse.de>
In-Reply-To: <20250206054504.2950516-1-neilb@suse.de>
References: <20250206054504.2950516-1-neilb@suse.de>

When we call ->revalidate we want to be sure we are revalidating the
expected name. As a shared lock on i_rwsem no longer prevents renames
we need to lock the dentry and ensure it still has the expected name.

So pass the parent and name to d_revalidate() and be prepared to retry
the lookup if it returns -EAGAIN.

Signed-off-by: NeilBrown
---
 fs/namei.c | 49 ++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 38 insertions(+), 11 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 145ae07f9b8c..3a107d6098be 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -957,12 +957,24 @@ static bool try_to_unlazy_next(struct nameidata *nd, struct dentry *dentry)
 }
 
 static inline int d_revalidate(struct inode *dir, const struct qstr *name,
-			       struct dentry *dentry, unsigned int flags)
+			       struct dentry *dentry, unsigned int flags,
+			       struct dentry *base, const struct qstr *last)
 {
-	if (unlikely(dentry->d_flags & DCACHE_OP_REVALIDATE))
-		return dentry->d_op->d_revalidate(dir, name, dentry, flags);
-	else
+	int status;
+
+	if (!unlikely(dentry->d_flags & DCACHE_OP_REVALIDATE))
 		return 1;
+
+	if (flags & LOOKUP_RCU) {
+		if (!d_update_trylock(dentry, base, last))
+			return -ECHILD;
+	} else {
+		if (!d_update_lock(dentry, base, last, I_MUTEX_NORMAL))
+			return -EAGAIN;
+	}
+	status = dentry->d_op->d_revalidate(dir, name, dentry, flags);
+	d_update_unlock(dentry);
+	return status;
 }
 
 /**
@@ -1686,13 +1698,18 @@ static struct dentry *lookup_dcache(const struct qstr *name,
 				    struct dentry *dir,
 				    unsigned int flags)
 {
-	struct dentry *dentry = d_lookup(dir, name);
+	struct dentry *dentry;
+again:
+	dentry = d_lookup(dir, name);
 	if (dentry) {
-		int error = d_revalidate(dir->d_inode, name, dentry, flags);
+		int error = d_revalidate(dir->d_inode, name, dentry, flags, dir, name);
 		if (unlikely(error <= 0)) {
 			if (!error)
 				d_invalidate(dentry);
 			dput(dentry);
+			if (error == -EAGAIN)
+				/* raced with rename etc */
+				goto again;
 			return ERR_PTR(error);
 		}
 	}
@@ -1915,6 +1932,7 @@ static struct dentry *lookup_fast(struct nameidata *nd)
 	 * of a false negative due to a concurrent rename, the caller is
 	 * going to fall back to non-racy lookup.
 	 */
+again:
 	if (nd->flags & LOOKUP_RCU) {
 		dentry = __d_lookup_rcu(parent, &nd->last, &nd->next_seq);
 		if (unlikely(!dentry)) {
@@ -1930,7 +1948,7 @@ static struct dentry *lookup_fast(struct nameidata *nd)
 		if (read_seqcount_retry(&parent->d_seq, nd->seq))
 			return ERR_PTR(-ECHILD);
 
-		status = d_revalidate(nd->inode, &nd->last, dentry, nd->flags);
+		status = d_revalidate(nd->inode, &nd->last, dentry, nd->flags, parent, &nd->last);
 		if (likely(status > 0))
 			return dentry;
 		if (!try_to_unlazy_next(nd, dentry))
@@ -1938,17 +1956,19 @@ static struct dentry *lookup_fast(struct nameidata *nd)
 		if (status == -ECHILD)
 			/* we'd been told to redo it in non-rcu mode */
 			status = d_revalidate(nd->inode, &nd->last,
-					      dentry, nd->flags);
+					      dentry, nd->flags, parent, &nd->last);
 	} else {
 		dentry = __d_lookup(parent, &nd->last);
 		if (unlikely(!dentry))
 			return NULL;
-		status = d_revalidate(nd->inode, &nd->last, dentry, nd->flags);
+		status = d_revalidate(nd->inode, &nd->last, dentry, nd->flags, parent, &nd->last);
 	}
 	if (unlikely(status <= 0)) {
 		if (!status)
 			d_invalidate(dentry);
 		dput(dentry);
+		if (status == -EAGAIN)
+			goto again;
 		return ERR_PTR(status);
 	}
 	return dentry;
@@ -1970,7 +1990,7 @@ static struct dentry *__lookup_slow(const struct qstr *name,
 	if (IS_ERR(dentry))
 		return dentry;
 	if (unlikely(!d_in_lookup(dentry))) {
-		int error = d_revalidate(inode, name, dentry, flags);
+		int error = d_revalidate(inode, name, dentry, flags, dir, name);
 		if (unlikely(error <= 0)) {
 			if (!error) {
 				d_invalidate(dentry);
@@ -1978,6 +1998,8 @@ static struct dentry *__lookup_slow(const struct qstr *name,
 				goto again;
 			}
 			dput(dentry);
+			if (error == -EAGAIN)
+				goto again;
 			dentry = ERR_PTR(error);
 		}
 	} else {
@@ -3777,6 +3799,7 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file,
 		return ERR_PTR(-ENOENT);
 
 	file->f_mode &= ~FMODE_CREATED;
+again:
 	dentry = d_lookup(dir, &nd->last);
 	for (;;) {
 		if (!dentry) {
@@ -3787,9 +3810,13 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file,
 		if (d_in_lookup(dentry))
 			break;
 
-		error = d_revalidate(dir_inode, &nd->last, dentry, nd->flags);
+		error = d_revalidate(dir_inode, &nd->last, dentry, nd->flags, dir, &nd->last);
 		if (likely(error > 0))
 			break;
+		if (error == -EAGAIN) {
+			dput(dentry);
+			goto again;
+		}
 		if (error)
 			goto out_dput;
 		d_invalidate(dentry);

From patchwork Thu Feb 6 05:42:51 2025
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 13962254
From: NeilBrown
To: Alexander Viro, Christian Brauner, Jan Kara, Linus Torvalds, Jeff Layton, Dave Chinner
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 14/19] VFS: Ensure no async updates happening in directory being removed.
Date: Thu, 6 Feb 2025 16:42:51 +1100
Message-ID: <20250206054504.2950516-15-neilb@suse.de>
In-Reply-To: <20250206054504.2950516-1-neilb@suse.de>
References: <20250206054504.2950516-1-neilb@suse.de>

vfs_rmdir takes an exclusive lock on the target directory to ensure
nothing new is created in it while the rmdir progresses. With the
possibility of async updates continuing after the inode lock is dropped
we now need extra protection.

Any async updates will have DCACHE_PAR_UPDATE set on the dentry. We
simply wait for that flag to be cleared on all children.

Signed-off-by: NeilBrown
---
 fs/dcache.c |  2 +-
 fs/namei.c  | 40 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index fb331596f1b1..90dee859d138 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -53,7 +53,7 @@
  *   - d_lru
  *   - d_count
  *   - d_unhashed()
- *   - d_parent and d_chilren
+ *   - d_parent and d_children
  *   - childrens' d_sib and d_parent
  *   - d_u.d_alias, d_inode
  *
diff --git a/fs/namei.c b/fs/namei.c
index 3a107d6098be..e8a85c9f431c 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1839,6 +1839,27 @@ bool d_update_lock(struct dentry *dentry,
 	return true;
 }
 
+static void d_update_wait(struct dentry *dentry, unsigned int subclass)
+{
+	/* Note this may only ever be called in a context where we have
+	 * a lock preventing this dentry from becoming locked, possibly
+	 * an update lock on the parent dentry.  There must be an smp_mb()
+	 * after that lock is taken and before this is called so that
+	 * the following test is safe. d_update_lock() provides that
+	 * barrier.
+ */ + if (!(dentry->d_flags & DCACHE_PAR_UPDATE)) + return + lock_acquire_exclusive(&dentry->d_update_map, subclass, + 0, NULL, _THIS_IP_); + spin_lock(&dentry->d_lock); + wait_var_event_spinlock(&dentry->d_flags, + !check_dentry_locked(dentry), + &dentry->d_lock); + spin_unlock(&dentry->d_lock); + lock_map_release(&dentry->d_update_map); +} + bool d_update_trylock(struct dentry *dentry, struct dentry *base, const struct qstr *last) @@ -4688,6 +4709,7 @@ int vfs_rmdir(struct mnt_idmap *idmap, struct inode *dir, struct dentry *dentry) { int error = may_delete(idmap, dir, dentry, 1); + struct dentry *child; if (error) return error; @@ -4697,6 +4719,24 @@ int vfs_rmdir(struct mnt_idmap *idmap, struct inode *dir, dget(dentry); inode_lock(dentry->d_inode); + /* + * Some children of dentry might be active in an async update. + * We need to wait for them. New children cannot be locked + * while the inode lock is held. + */ +again: + spin_lock(&dentry->d_lock); + for (child = d_first_child(dentry); child; + child = d_next_sibling(child)) { + if (child->d_flags & DCACHE_PAR_UPDATE) { + dget(child); + spin_unlock(&dentry->d_lock); + d_update_wait(child, I_MUTEX_CHILD); + dput(child); + goto again; + } + } + spin_unlock(&dentry->d_lock); error = -EBUSY; if (is_local_mountpoint(dentry) || From patchwork Thu Feb 6 05:42:52 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 13962255 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 45A9A1917E7; Thu, 6 Feb 2025 05:47:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738820871; cv=none; 
From: NeilBrown
To: Alexander Viro, Christian Brauner, Jan Kara, Linus Torvalds,
 Jeff Layton, Dave Chinner
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 15/19] VFS: Change lookup_and_lock() to use shared lock when possible.
Date: Thu, 6 Feb 2025 16:42:52 +1100
Message-ID: <20250206054504.2950516-16-neilb@suse.de>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250206054504.2950516-1-neilb@suse.de>
References: <20250206054504.2950516-1-neilb@suse.de>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

lookup_and_lock() and done_lookup_and_lock() are now told, via LOOKUP_
intent flags, what operation is being performed, including a
new LOOKUP_REMOVE. They use this to determine whether shared or
exclusive locking is needed.

If all filesystems eventually support all async interfaces, this locking
can be discarded.

Signed-off-by: NeilBrown
---
 fs/namei.c | 40 ++++++++++++++++++++++++++++++++--------
 1 file changed, 32 insertions(+), 8 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index e8a85c9f431c..c7b7445c770e 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1898,13 +1898,26 @@ static struct dentry *lookup_and_lock_nested(const struct qstr *last,
 						unsigned int subclass)
 {
 	struct dentry *dentry;
+	unsigned int shared = 0;
 
-	if (!(lookup_flags & LOOKUP_PARENT_LOCKED))
-		inode_lock_nested(base->d_inode, subclass);
+	if (!(lookup_flags & LOOKUP_PARENT_LOCKED)) {
+		if (lookup_flags & LOOKUP_CREATE)
+			shared = S_ASYNC_CREATE;
+		if (lookup_flags & LOOKUP_REMOVE)
+			shared = S_ASYNC_REMOVE;
+
+		if (base->d_inode->i_flags & shared)
+			inode_lock_shared_nested(base->d_inode, subclass);
+		else
+			inode_lock_nested(base->d_inode, subclass);
+	}
 	do {
 		dentry = lookup_one_qstr(last, base, lookup_flags);
 	} while (!IS_ERR(dentry) && !d_update_lock(dentry, base, last, subclass));
 	if (IS_ERR(dentry) && !(lookup_flags & LOOKUP_PARENT_LOCKED)) {
+		if (base->d_inode->i_flags & shared)
+			inode_unlock_shared(base->d_inode);
+		else
 			inode_unlock(base->d_inode);
 	}
 	return dentry;
@@ -1921,11 +1934,22 @@ static struct dentry *lookup_and_lock(const struct qstr *last,
 void done_lookup_and_lock(struct dentry *base, struct dentry *dentry,
 			  unsigned int lookup_flags)
 {
+	unsigned int shared = 0;
+
+	if (lookup_flags & LOOKUP_CREATE)
+		shared = S_ASYNC_CREATE;
+	if (lookup_flags & LOOKUP_REMOVE)
+		shared = S_ASYNC_REMOVE;
+
 	d_lookup_done(dentry);
 	d_update_unlock(dentry);
 	dput(dentry);
-	if (!(lookup_flags & LOOKUP_PARENT_LOCKED))
-		inode_unlock(base->d_inode);
+	if (!(lookup_flags & LOOKUP_PARENT_LOCKED)) {
+		if (base->d_inode->i_flags & shared)
+			inode_unlock_shared(base->d_inode);
+		else
+			inode_unlock(base->d_inode);
+	}
 }
 EXPORT_SYMBOL(done_lookup_and_lock);
 
@@ -4004,7 +4028,7 @@ static const char *open_last_lookups(struct nameidata *nd,
 		 * dropping this one anyway.
 		 */
 	}
-	if (open_flag & O_CREAT)
+	if ((open_flag & O_CREAT) && !(dir->d_inode->i_flags & S_ASYNC_OPEN))
 		inode_lock(dir->d_inode);
 	else
 		inode_lock_shared(dir->d_inode);
@@ -4015,7 +4039,7 @@ static const char *open_last_lookups(struct nameidata *nd,
 		if (file->f_mode & FMODE_OPENED)
 			fsnotify_open(file);
 	}
-	if (open_flag & O_CREAT)
+	if ((open_flag & O_CREAT) && !(dir->d_inode->i_flags & S_ASYNC_OPEN))
 		inode_unlock(dir->d_inode);
 	else
 		inode_unlock_shared(dir->d_inode);
@@ -4775,7 +4799,7 @@ int do_rmdir(int dfd, struct filename *name)
 	struct path path;
 	struct qstr last;
 	int type;
-	unsigned int lookup_flags = 0;
+	unsigned int lookup_flags = LOOKUP_REMOVE;
 retry:
 	error = filename_parentat(dfd, name, lookup_flags, &path, &last, &type);
 	if (error)
@@ -4914,7 +4938,7 @@ int do_unlinkat(int dfd, struct filename *name)
 	int type;
 	struct inode *inode = NULL;
 	struct inode *delegated_inode = NULL;
-	unsigned int lookup_flags = 0;
+	unsigned int lookup_flags = LOOKUP_REMOVE;
 retry:
 	error = filename_parentat(dfd, name, lookup_flags, &path, &last, &type);
 	if (error)

From patchwork Thu Feb 6 05:42:53 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 13962256
From: NeilBrown
To: Alexander Viro, Christian Brauner, Jan Kara, Linus Torvalds,
 Jeff Layton, Dave Chinner
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 16/19] VFS: add lookup_and_lock_rename()
Date: Thu, 6 Feb 2025 16:42:53 +1100
Message-ID: <20250206054504.2950516-17-neilb@suse.de>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250206054504.2950516-1-neilb@suse.de>
References: <20250206054504.2950516-1-neilb@suse.de>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

lookup_and_lock_rename() combines locking and lookup for two names.

It uses the new lock_two_directories_shared() if that is appropriate
for the filesystem.

unlock_rename_shared() does either a shared unlock or an exclusive
unlock depending on how the filesystem wants rename to be handled.

lookup_and_lock_rename_one() and done_lookup_and_lock_rename() are
exported for other modules to use.

As a rename can continue asynchronously after the inode lock is dropped,
lock_two_directories() and lock_two_directories_shared() must ensure
that is not happening before looking at ->d_parent. This requires a
call to d_update_wait(). Note that if the dentry is locked for update
it must be a rename. It cannot be a create or a (successful) rmdir as
these dentries are not empty - except possibly the target directory,
but waiting for the rmdir there is still needed of course.
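
A note on lock ordering: lookup_and_lock_rename() in the diff below
takes the two dentry update locks in ascending address order (the
d1 < d2 test) and takes only one lock when both names resolve to the
same dentry. The following is a minimal user-space sketch of that
ordering idea only; it is not kernel code and not the patch's
implementation. struct node, node_lock(), node_unlock(), lock_pair(),
unlock_pair() and the trace array are hypothetical names invented for
illustration, and a C11 atomic_flag spinlock stands in for
d_update_lock().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a dentry: only its per-object lock matters. */
struct node {
	atomic_flag busy;
};

/* Acquisition log for the demonstration only (not thread-safe). */
const struct node *trace[8];
size_t ntrace;

/* Spin until the flag is ours, then record the acquisition. */
void node_lock(struct node *n)
{
	while (atomic_flag_test_and_set_explicit(&n->busy, memory_order_acquire))
		;
	if (ntrace < 8)
		trace[ntrace++] = n;
}

void node_unlock(struct node *n)
{
	atomic_flag_clear_explicit(&n->busy, memory_order_release);
}

/*
 * Lock two nodes in ascending address order, so that every caller
 * agrees on the order. If both arguments are the same node, lock once.
 */
void lock_pair(struct node *a, struct node *b)
{
	if (a == b) {
		node_lock(a);
	} else if ((uintptr_t)a < (uintptr_t)b) {
		node_lock(a);
		node_lock(b);
	} else {
		node_lock(b);
		node_lock(a);
	}
}

/* Release both locks; release only once if the nodes are identical. */
void unlock_pair(struct node *a, struct node *b)
{
	node_unlock(a);
	if (b != a)
		node_unlock(b);
}
```

Because every caller acquires in the same global order, two threads
locking the pairs (a, b) and (b, a) cannot each hold one lock while
waiting for the other.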

Signed-off-by: NeilBrown
---
 fs/namei.c            | 230 +++++++++++++++++++++++++++++++++++-------
 include/linux/namei.h |   7 ++
 2 files changed, 199 insertions(+), 38 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index c7b7445c770e..771e9d7b620c 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3451,8 +3451,14 @@ static struct dentry *lock_two_directories(struct dentry *p1, struct dentry *p2)
 {
 	struct dentry *p = p1, *q = p2, *r;
 
-	while ((r = p->d_parent) != p2 && r != p)
+	/* Ensure d_update_wait() tests are safe - one barrier for all */
+	smp_mb();
+
+	d_update_wait(p, I_MUTEX_NORMAL);
+	while ((r = p->d_parent) != p2 && r != p) {
 		p = r;
+		d_update_wait(p, I_MUTEX_NORMAL);
+	}
 	if (r == p2) {
 		// p is a child of p2 and an ancestor of p1 or p1 itself
 		inode_lock_nested(p2->d_inode, I_MUTEX_PARENT);
@@ -3461,8 +3467,11 @@ static struct dentry *lock_two_directories(struct dentry *p1, struct dentry *p2)
 	}
 	// p is the root of connected component that contains p1
 	// p2 does not occur on the path from p to p1
-	while ((r = q->d_parent) != p1 && r != p && r != q)
+	d_update_wait(q, I_MUTEX_NORMAL);
+	while ((r = q->d_parent) != p1 && r != p && r != q) {
 		q = r;
+		d_update_wait(q, I_MUTEX_NORMAL);
+	}
 	if (r == p1) {
 		// q is a child of p1 and an ancestor of p2 or p2 itself
 		inode_lock_nested(p1->d_inode, I_MUTEX_PARENT);
@@ -3479,6 +3488,46 @@ static struct dentry *lock_two_directories(struct dentry *p1, struct dentry *p2)
 	}
 }
 
+static struct dentry *lock_two_directories_shared(struct dentry *p1, struct dentry *p2)
+{
+	struct dentry *p = p1, *q = p2, *r;
+
+	/* Ensure d_update_wait() tests are safe - one barrier for all */
+	smp_mb();
+
+	d_update_wait(p1, I_MUTEX_NORMAL);
+	while ((r = p->d_parent) != p2 && r != p) {
+		p = r;
+		d_update_wait(p, I_MUTEX_NORMAL);
+	}
+	if (r == p2) {
+		// p is a child of p2 and an ancestor of p1 or p1 itself
+		inode_lock_shared_nested(p2->d_inode, I_MUTEX_PARENT);
+		inode_lock_shared_nested(p1->d_inode, I_MUTEX_PARENT2);
+		return p;
+	}
+	// p is the root of connected component that contains p1
+	// p2 does not occur on the path from p to p1
+	d_update_wait(q, I_MUTEX_NORMAL);
+	while ((r = q->d_parent) != p1 && r != p && r != q) {
+		q = r;
+		d_update_wait(q, I_MUTEX_NORMAL);
+	}
+	if (r == p1) {
+		// q is a child of p1 and an ancestor of p2 or p2 itself
+		inode_lock_shared_nested(p1->d_inode, I_MUTEX_PARENT);
+		inode_lock_shared_nested(p2->d_inode, I_MUTEX_PARENT2);
+		return q;
+	} else if (likely(r == p)) {
+		// both p2 and p1 are descendents of p
+		inode_lock_shared_nested(p1->d_inode, I_MUTEX_PARENT);
+		inode_lock_shared_nested(p2->d_inode, I_MUTEX_PARENT2);
+		return NULL;
+	} else { // no common ancestor at the time we'd been called
+		return ERR_PTR(-EXDEV);
+	}
+}
+
 /*
  * p1 and p2 should be directories on the same fs.
  */
@@ -3494,6 +3543,134 @@ struct dentry *lock_rename(struct dentry *p1, struct dentry *p2)
 }
 EXPORT_SYMBOL(lock_rename);
 
+static void unlock_rename_shared(struct dentry *p1, struct dentry *p2)
+{
+	if (!(p1->d_inode->i_flags & S_ASYNC_RENAME))
+		unlock_rename(p1, p2);
+	else {
+		inode_unlock_shared(p1->d_inode);
+		if (p1 != p2) {
+			inode_unlock_shared(p2->d_inode);
+			mutex_unlock(&p1->d_sb->s_vfs_rename_mutex);
+		}
+	}
+}
+
+static int
+lookup_and_lock_rename(struct dentry *p1, struct dentry *p2,
+		       struct dentry **d1p, struct dentry **d2p,
+		       struct qstr *last1, struct qstr *last2,
+		       unsigned int flags1, unsigned int flags2)
+{
+	struct dentry *p = NULL;
+	struct dentry *d1, *d2;
+	bool ok1, ok2;
+
+	if (p1->d_inode->i_flags & S_ASYNC_RENAME) {
+		if (p1 == p2) {
+			/* same parent - only one parent lock needed and
+			 * no s_vfs_rename_mutex */
+			inode_lock_shared_nested(p1->d_inode, I_MUTEX_PARENT);
+		} else {
+			mutex_lock(&p1->d_sb->s_vfs_rename_mutex);
+
+			p = lock_two_directories_shared(p1, p2);
+			if (IS_ERR(p)) {
+				mutex_unlock(&p1->d_sb->s_vfs_rename_mutex);
+				return PTR_ERR(p);
+			}
+		}
+	} else
+		lock_rename(p1, p2);
+retry:
+	d1 = lookup_one_qstr(last1, p1, flags1);
+	if (IS_ERR(d1))
+		goto out_unlock_1;
+	d2 = lookup_one_qstr(last2, p2, flags2);
+	if (IS_ERR(d2))
+		goto out_unlock_2;
+
+	if (d1 == p) {
+		dput(d1); dput(d2);
+		unlock_rename_shared(p1, p2);
+		if (flags1 & LOOKUP_CREATE)
+			return -EINVAL;
+		else
+			return -ENOTEMPTY;
+	}
+
+	if (d2 == p) {
+		dput(d1); dput(d2);
+		unlock_rename_shared(p1, p2);
+		if (flags2 & LOOKUP_CREATE)
+			return -EINVAL;
+		else
+			return -ENOTEMPTY;
+	}
+
+	if (d1 < d2) {
+		ok1 = d_update_lock(d1, p1, last1, I_MUTEX_PARENT);
+		ok2 = d_update_lock(d2, p2, last2, I_MUTEX_PARENT2);
+	} else if (d1 > d2) {
+		ok2 = d_update_lock(d2, p2, last2, I_MUTEX_PARENT);
+		ok1 = d_update_lock(d1, p1, last1, I_MUTEX_PARENT2);
+	} else {
+		ok1 = ok2 = d_update_lock(d1, p1, last1, I_MUTEX_PARENT);
+	}
+	if (!ok1 || !ok2) {
+		if (ok1)
+			d_update_unlock(d1);
+		if (ok2 && d2 != d1)
+			d_update_unlock(d2);
+		dput(d1);
+		dput(d2);
+		goto retry;
+	}
+	*d1p = d1; *d2p = d2;
+	return 0;
+
+out_unlock_2:
+	dput(d1);
+	d1 = d2;
+out_unlock_1:
+	unlock_rename_shared(p1, p2);
+	return PTR_ERR(d1);
+}
+
+int lookup_and_lock_rename_one(struct dentry *p1, struct dentry *p2,
+			       struct dentry **d1p, struct dentry **d2p,
+			       const char *name1, int nlen1,
+			       const char *name2, int nlen2,
+			       unsigned int flags1, unsigned int flags2)
+{
+	struct qstr this1, this2;
+	int err;
+
+	err = lookup_one_common(&nop_mnt_idmap, name1, p1, nlen1, &this1);
+	if (err)
+		return err;
+	err = lookup_one_common(&nop_mnt_idmap, name2, p2, nlen2, &this2);
+	if (err)
+		return err;
+	return lookup_and_lock_rename(p1, p2, d1p, d2p, &this1, &this2,
+				      flags1, flags2);
+}
+EXPORT_SYMBOL(lookup_and_lock_rename_one);
+
+void done_lookup_and_lock_rename(struct dentry *p1, struct dentry *p2,
+				 struct dentry *d1, struct dentry *d2)
+{
+	d_lookup_done(d1);
+	d_lookup_done(d2);
+	d_update_unlock(d1);
+	if (d2 != d1)
+		d_update_unlock(d2);
+	unlock_rename_shared(p1, p2);
+	dput(d1);
+	dput(d2);
+}
+EXPORT_SYMBOL(done_lookup_and_lock_rename);
+
 /*
  * c1 and p2 should be on the same fs.
  */
@@ -5497,7 +5674,6 @@ int do_renameat2(int olddfd, struct filename *from, int newdfd,
 {
 	struct renamedata rd;
 	struct dentry *old_dentry, *new_dentry;
-	struct dentry *trap;
 	struct path old_path, new_path;
 	struct qstr old_last, new_last;
 	int old_type, new_type;
@@ -5548,51 +5724,33 @@ int do_renameat2(int olddfd, struct filename *from, int newdfd,
 		goto exit2;
 
 retry_deleg:
-	trap = lock_rename(new_path.dentry, old_path.dentry);
-	if (IS_ERR(trap)) {
-		error = PTR_ERR(trap);
+	error = lookup_and_lock_rename(old_path.dentry, new_path.dentry,
+				       &old_dentry, &new_dentry,
+				       &old_last, &new_last,
+				       lookup_flags, lookup_flags | target_flags);
+	if (error)
 		goto exit_lock_rename;
-	}
 
-	old_dentry = lookup_one_qstr(&old_last, old_path.dentry,
-				     lookup_flags);
-	error = PTR_ERR(old_dentry);
-	if (IS_ERR(old_dentry))
-		goto exit3;
-	new_dentry = lookup_one_qstr(&new_last, new_path.dentry,
-				     lookup_flags | target_flags);
-	error = PTR_ERR(new_dentry);
-	if (IS_ERR(new_dentry))
-		goto exit4;
 	if (flags & RENAME_EXCHANGE) {
 		if (!d_is_dir(new_dentry)) {
 			error = -ENOTDIR;
 			if (new_last.name[new_last.len])
-				goto exit5;
+				goto exit_unlock;
 		}
 	}
 	/* unless the source is a directory trailing slashes give -ENOTDIR */
 	if (!d_is_dir(old_dentry)) {
 		error = -ENOTDIR;
 		if (old_last.name[old_last.len])
-			goto exit5;
+			goto exit_unlock;
 		if (!(flags & RENAME_EXCHANGE) && new_last.name[new_last.len])
-			goto exit5;
-	}
-	/* source should not be ancestor of target */
-	error = -EINVAL;
-	if (old_dentry == trap)
-		goto exit5;
-	/* target should not be an ancestor of source */
-	if (!(flags & RENAME_EXCHANGE))
-		error = -ENOTEMPTY;
-	if (new_dentry == trap)
-		goto exit5;
+			goto exit_unlock;
+	}
 
 	error = security_path_rename(&old_path, old_dentry,
 				     &new_path, new_dentry, flags);
 	if (error)
-		goto exit5;
+		goto exit_unlock;
 
 	rd.old_dir	   = old_path.dentry->d_inode;
 	rd.old_dentry	   = old_dentry;
@@ -5603,13 +5761,9 @@ int do_renameat2(int olddfd, struct filename *from, int newdfd,
 	rd.delegated_inode = &delegated_inode;
 	rd.flags	   = flags;
 	error = vfs_rename(&rd);
-exit5:
-	d_lookup_done(new_dentry);
-	dput(new_dentry);
-exit4:
-	dput(old_dentry);
-exit3:
-	unlock_rename(new_path.dentry, old_path.dentry);
+exit_unlock:
+	done_lookup_and_lock_rename(new_path.dentry, old_path.dentry,
+				    new_dentry, old_dentry);
 exit_lock_rename:
 	if (delegated_inode) {
 		error = break_deleg_wait(&delegated_inode);
diff --git a/include/linux/namei.h b/include/linux/namei.h
index 72e351640406..8ef7aa6ed64c 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -104,6 +104,13 @@ extern int follow_up(struct path *);
 extern struct dentry *lock_rename(struct dentry *, struct dentry *);
 extern struct dentry *lock_rename_child(struct dentry *, struct dentry *);
 extern void unlock_rename(struct dentry *, struct dentry *);
+int lookup_and_lock_rename_one(struct dentry *p1, struct dentry *p2,
+			       struct dentry **d1p, struct dentry **d2p,
+			       const char *name1, int nlen1,
+			       const char *name2, int nlen2,
+			       unsigned int flags1, unsigned int flags2);
+void done_lookup_and_lock_rename(struct dentry *p1, struct dentry *p2,
+				 struct dentry *d1, struct dentry *d2);
 
 /**
  * mode_strip_umask - handle vfs umask stripping

From patchwork Thu Feb 6 05:42:54 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 13962257
bh=nzJzvVSPDZKlnRYA1D5bBmX9O8UlzUVlfs44jkMjMDU=; b=ceHtDF+gXEs6hqVqFFr/dv1cZtqedHnRkVde8mWEgVGzOmynyQmY+O5XlOh7p4YeSb3Ybf UGB7MsEp9Kig7LCg== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id A170D13795; Thu, 6 Feb 2025 05:48:02 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id DntlFRJNpGcCCAAAD6G6ig (envelope-from ); Thu, 06 Feb 2025 05:48:02 +0000 From: NeilBrown To: Alexander Viro , Christian Brauner , Jan Kara , Linus Torvalds , Jeff Layton , Dave Chinner Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 17/19] nfsd: use lookup_and_lock_one() and lookup_and_lock_rename_one() Date: Thu, 6 Feb 2025 16:42:54 +1100 Message-ID: <20250206054504.2950516-18-neilb@suse.de> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250206054504.2950516-1-neilb@suse.de> References: <20250206054504.2950516-1-neilb@suse.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Rspamd-Queue-Id: 643B61F381 X-Spam-Level: X-Spamd-Result: default: False [-3.01 / 50.00]; BAYES_HAM(-3.00)[100.00%]; MID_CONTAINS_FROM(1.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_MISSING_CHARSET(0.50)[]; NEURAL_HAM_SHORT(-0.20)[-1.000]; R_DKIM_ALLOW(-0.20)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; MIME_GOOD(-0.10)[text/plain]; MX_GOOD(-0.01)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_TRACE(0.00)[0:+]; ARC_NA(0.00)[]; TO_DN_SOME(0.00)[]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; RBL_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:104:10:150:64:97:from]; FUZZY_BLOCKED(0.00)[rspamd.com]; RCVD_TLS_ALL(0.00)[]; DKIM_TRACE(0.00)[suse.de:+]; RCVD_COUNT_TWO(0.00)[2]; 
FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[]; SPAMHAUS_XBL(0.00)[2a07:de40:b281:104:10:150:64:97:from]; RCPT_COUNT_SEVEN(0.00)[8]; RCVD_VIA_SMTP_AUTH(0.00)[]; RECEIVED_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:106:10:150:64:167:received]; R_RATELIMIT(0.00)[from(RLewrxuus8mos16izbn)]; DBL_BLOCKED_OPENRESOLVER(0.00)[imap1.dmz-prg2.suse.org:helo,imap1.dmz-prg2.suse.org:rdns,suse.de:email,suse.de:dkim,suse.de:mid] X-Rspamd-Server: rspamd2.dmz-prg2.suse.org X-Rspamd-Action: no action X-Spam-Score: -3.01 X-Spam-Flag: NO nfsd now uses lookup_and_lock_one() when creating/removing names in the exported filesystem. It uses lookup_and_lock_rename_one() when renaming. Signed-off-by: NeilBrown --- fs/nfsd/nfsproc.c | 12 +++--- fs/nfsd/vfs.c | 107 +++++++++++++--------------------------------- 2 files changed, 36 insertions(+), 83 deletions(-) diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c index 6dda081eb24c..27c2b1d5e1ac 100644 --- a/fs/nfsd/nfsproc.c +++ b/fs/nfsd/nfsproc.c @@ -311,17 +311,16 @@ nfsd_proc_create(struct svc_rqst *rqstp) goto done; } - inode_lock_nested(dirfhp->fh_dentry->d_inode, I_MUTEX_PARENT); - dchild = lookup_one_len(argp->name, dirfhp->fh_dentry, argp->len); + dchild = lookup_and_lock_one(NULL, argp->name, argp->len, + dirfhp->fh_dentry, LOOKUP_CREATE); if (IS_ERR(dchild)) { resp->status = nfserrno(PTR_ERR(dchild)); - goto out_unlock; + goto put_write; } fh_init(newfhp, NFS_FHSIZE); resp->status = fh_compose(newfhp, dirfhp->fh_export, dchild, dirfhp); if (!resp->status && d_really_is_negative(dchild)) resp->status = nfserr_noent; - dput(dchild); if (resp->status) { if (resp->status != nfserr_noent) goto out_unlock; @@ -331,7 +330,7 @@ nfsd_proc_create(struct svc_rqst *rqstp) */ resp->status = nfserr_acces; if (!newfhp->fh_dentry) { - printk(KERN_WARNING + printk(KERN_WARNING "nfsd_proc_create: file handle not verified\n"); goto out_unlock; } @@ -427,7 +426,8 @@ nfsd_proc_create(struct svc_rqst *rqstp) } out_unlock: -
inode_unlock(dirfhp->fh_dentry->d_inode); + done_lookup_and_lock(dirfhp->fh_dentry, dchild, LOOKUP_CREATE); +put_write: fh_drop_write(dirfhp); done: fh_put(dirfhp); diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index 740332413138..af4a7f75cca0 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -1551,19 +1551,13 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp, if (host_err) return nfserrno(host_err); - inode_lock_nested(dentry->d_inode, I_MUTEX_PARENT); - dchild = lookup_one_len(fname, dentry, flen); + dchild = lookup_and_lock_one(NULL, fname, flen, dentry, LOOKUP_CREATE); host_err = PTR_ERR(dchild); if (IS_ERR(dchild)) { err = nfserrno(host_err); - goto out_unlock; + goto out; } err = fh_compose(resfhp, fhp->fh_export, dchild, fhp); - /* - * We unconditionally drop our ref to dchild as fh_compose will have - * already grabbed its own ref for it. - */ - dput(dchild); if (err) goto out_unlock; err = fh_fill_pre_attrs(fhp); @@ -1572,7 +1566,8 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp, err = nfsd_create_locked(rqstp, fhp, attrs, type, rdev, resfhp); fh_fill_post_attrs(fhp); out_unlock: - inode_unlock(dentry->d_inode); + done_lookup_and_lock(dentry, dchild, LOOKUP_CREATE); +out: return err; } @@ -1656,8 +1651,7 @@ nfsd_symlink(struct svc_rqst *rqstp, struct svc_fh *fhp, } dentry = fhp->fh_dentry; - inode_lock_nested(dentry->d_inode, I_MUTEX_PARENT); - dnew = lookup_one_len(fname, dentry, flen); + dnew = lookup_and_lock_one(NULL, fname, flen, dentry, LOOKUP_CREATE); if (IS_ERR(dnew)) { err = nfserrno(PTR_ERR(dnew)); inode_unlock(dentry->d_inode); @@ -1673,11 +1667,11 @@ nfsd_symlink(struct svc_rqst *rqstp, struct svc_fh *fhp, nfsd_create_setattr(rqstp, fhp, resfhp, attrs); fh_fill_post_attrs(fhp); out_unlock: - inode_unlock(dentry->d_inode); + done_lookup_and_lock(dentry, dnew, LOOKUP_CREATE); if (!err) err = nfserrno(commit_metadata(fhp)); - dput(dnew); - if (err==0) err = cerr; + if (err==0) + err = cerr; out_drop_write: fh_drop_write(fhp); out: 
@@ -1721,43 +1715,35 @@ nfsd_link(struct svc_rqst *rqstp, struct svc_fh *ffhp, ddir = ffhp->fh_dentry; dirp = d_inode(ddir); - inode_lock_nested(dirp, I_MUTEX_PARENT); - - dnew = lookup_one_len(name, ddir, len); + dnew = lookup_and_lock_one(NULL, name, len, ddir, LOOKUP_CREATE); if (IS_ERR(dnew)) { - err = nfserrno(PTR_ERR(dnew)); - goto out_unlock; + err = PTR_ERR(dnew); + goto out_drop_write; } dold = tfhp->fh_dentry; err = nfserr_noent; if (d_really_is_negative(dold)) - goto out_dput; + goto out_unlock; err = fh_fill_pre_attrs(ffhp); if (err != nfs_ok) - goto out_dput; + goto out_unlock; host_err = vfs_link(dold, &nop_mnt_idmap, dirp, dnew, NULL); fh_fill_post_attrs(ffhp); - inode_unlock(dirp); - if (!host_err) { +out_unlock: + done_lookup_and_lock(ddir, dnew, LOOKUP_CREATE); + if (!err && !host_err) { err = nfserrno(commit_metadata(ffhp)); if (!err) err = nfserrno(commit_metadata(tfhp)); - } else { + } else if (!err) { err = nfserrno(host_err); } - dput(dnew); out_drop_write: fh_drop_write(tfhp); out: return err; - -out_dput: - dput(dnew); -out_unlock: - inode_unlock(dirp); - goto out_drop_write; } static void @@ -1788,7 +1774,7 @@ __be32 nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen, struct svc_fh *tfhp, char *tname, int tlen) { - struct dentry *fdentry, *tdentry, *odentry, *ndentry, *trap; + struct dentry *fdentry, *tdentry, *odentry, *ndentry; struct inode *fdir, *tdir; __be32 err; int host_err; @@ -1824,9 +1810,12 @@ nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen, goto out; } - trap = lock_rename(tdentry, fdentry); - if (IS_ERR(trap)) { - err = nfserr_xdev; + host_err = lookup_and_lock_rename_one(fdentry, tdentry, + &odentry, &ndentry, + fname, flen, tname, tlen, + 0, LOOKUP_CREATE|LOOKUP_RENAME_TARGET); + if (host_err) { + err = nfserrno(host_err); goto out_want_write; } err = fh_fill_pre_attrs(ffhp); @@ -1836,30 +1825,10 @@ nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char 
*fname, int flen, if (err != nfs_ok) goto out_unlock; - odentry = lookup_one_len(fname, fdentry, flen); - host_err = PTR_ERR(odentry); - if (IS_ERR(odentry)) - goto out_nfserr; - - host_err = -ENOENT; - if (d_really_is_negative(odentry)) - goto out_dput_old; - host_err = -EINVAL; - if (odentry == trap) - goto out_dput_old; - - ndentry = lookup_one_len(tname, tdentry, tlen); - host_err = PTR_ERR(ndentry); - if (IS_ERR(ndentry)) - goto out_dput_old; - host_err = -ENOTEMPTY; - if (ndentry == trap) - goto out_dput_new; - if ((ndentry->d_sb->s_export_op->flags & EXPORT_OP_CLOSE_BEFORE_UNLINK) && nfsd_has_cached_files(ndentry)) { close_cached = true; - goto out_dput_old; + goto out_unlock; } else { struct renamedata rd = { .old_mnt_idmap = &nop_mnt_idmap, @@ -1884,11 +1853,6 @@ nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen, host_err = commit_metadata(ffhp); } } - out_dput_new: - dput(ndentry); - out_dput_old: - dput(odentry); - out_nfserr: err = nfserrno(host_err); if (!close_cached) { @@ -1896,7 +1860,7 @@ nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen, fh_fill_post_attrs(tfhp); } out_unlock: - unlock_rename(tdentry, fdentry); + done_lookup_and_lock_rename(fdentry, tdentry, odentry, ndentry); out_want_write: fh_drop_write(ffhp); @@ -1943,18 +1907,11 @@ nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type, dentry = fhp->fh_dentry; dirp = d_inode(dentry); - inode_lock_nested(dirp, I_MUTEX_PARENT); - - rdentry = lookup_one_len(fname, dentry, flen); + rdentry = lookup_and_lock_one(NULL, fname, flen, dentry, LOOKUP_REMOVE); host_err = PTR_ERR(rdentry); if (IS_ERR(rdentry)) - goto out_unlock; + goto out_drop_write; - if (d_really_is_negative(rdentry)) { - dput(rdentry); - host_err = -ENOENT; - goto out_unlock; - } rinode = d_inode(rdentry); err = fh_fill_pre_attrs(fhp); if (err != nfs_ok) @@ -1981,11 +1938,10 @@ nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type, host_err = 
vfs_rmdir(&nop_mnt_idmap, dirp, rdentry); } fh_fill_post_attrs(fhp); - - inode_unlock(dirp); +out_unlock: + done_lookup_and_lock(dentry, rdentry, LOOKUP_REMOVE); if (!host_err) host_err = commit_metadata(fhp); - dput(rdentry); iput(rinode); /* truncate the inode here */ out_drop_write: @@ -2001,9 +1957,6 @@ nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type, } out: return err; -out_unlock: - inode_unlock(dirp); - goto out_drop_write; } /* From patchwork Thu Feb 6 05:42:55 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 13962258 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.223.130]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5D3EE224B12; Thu, 6 Feb 2025 05:48:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.130 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738820899; cv=none; b=HuiGkCTSEzeco0Pr+7bMiAhzLpNL/IPFJ/uAhvDpa6696ofPEoRGFDmeIZnCAiTOUW3WZKnNGftXFtWJ3StoX3TmqrJplqdIjyobkUTpVqV7u8i1xr6xMVyvTvlfKFt3HrqA5S3sgqHQZEpSmnHIvAqxHd17rmC9hjJIWwQuCtw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738820899; c=relaxed/simple; bh=yqPLlIyNRSWzQcBc2rqx2rU9e/BkDUK/oLfSSTjFzek=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=JpKCOO22TwvhSZT7Euu69LmEkYuEvIm/jKsEVXOiIiN4qObV7q4stgCovZxHfyLkJG5+CnVt7PBigiFCyGactjZMNxX3SsXrhlol3FxASe+0taYw+CW+QyMHmOgDsgEs/GR/QEAhHrFawoTMXU4M2vJtnEgYmGlYz1iJUrXRUE8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de; spf=pass smtp.mailfrom=suse.de; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=oB1aKeOZ; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=iIqrC7xF; 
dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=oB1aKeOZ; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=iIqrC7xF; arc=none smtp.client-ip=195.135.223.130 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="oB1aKeOZ"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="iIqrC7xF"; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="oB1aKeOZ"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="iIqrC7xF" Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id A9A1821108; Thu, 6 Feb 2025 05:48:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1738820895; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=oVe7gYkowi0dGd3vy2szPeCKItpohkL7PZISomngb4A=; b=oB1aKeOZ3N4SlUKIXXMCmzCYmK24kMqwPV7m7t9Impiji4KYUlOrn/HkQoJbw7irl4nwit ktGdGJhTz/tsB7D9XGME+xcWM5VoFKfePKnhrvINfhPYSEZ8Wen+YnY4eQdE0rbDKniKgO 6cr/jJAPXeeoms3FJrsXdzOMiyhRrxQ= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1738820895; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=oVe7gYkowi0dGd3vy2szPeCKItpohkL7PZISomngb4A=; 
b=iIqrC7xFtHajo8zueilFdg2SDw9w8MQdkt+MQ2PiahXbOg7C/S5PAXsXwNis+zFdix+XrL c2A4m3xj1pwoDyCQ== Authentication-Results: smtp-out1.suse.de; dkim=pass header.d=suse.de header.s=susede2_rsa header.b=oB1aKeOZ; dkim=pass header.d=suse.de header.s=susede2_ed25519 header.b=iIqrC7xF DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1738820895; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=oVe7gYkowi0dGd3vy2szPeCKItpohkL7PZISomngb4A=; b=oB1aKeOZ3N4SlUKIXXMCmzCYmK24kMqwPV7m7t9Impiji4KYUlOrn/HkQoJbw7irl4nwit ktGdGJhTz/tsB7D9XGME+xcWM5VoFKfePKnhrvINfhPYSEZ8Wen+YnY4eQdE0rbDKniKgO 6cr/jJAPXeeoms3FJrsXdzOMiyhRrxQ= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1738820895; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=oVe7gYkowi0dGd3vy2szPeCKItpohkL7PZISomngb4A=; b=iIqrC7xFtHajo8zueilFdg2SDw9w8MQdkt+MQ2PiahXbOg7C/S5PAXsXwNis+zFdix+XrL c2A4m3xj1pwoDyCQ== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id E59F813795; Thu, 6 Feb 2025 05:48:12 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id n1gUJhxNpGcKCAAAD6G6ig (envelope-from ); Thu, 06 Feb 2025 05:48:12 +0000 From: NeilBrown To: Alexander Viro , Christian Brauner , Jan Kara , Linus Torvalds , Jeff Layton , Dave Chinner Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 18/19] nfs: change mkdir inode_operation to 
mkdir_async Date: Thu, 6 Feb 2025 16:42:55 +1100 Message-ID: <20250206054504.2950516-19-neilb@suse.de> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250206054504.2950516-1-neilb@suse.de> References: <20250206054504.2950516-1-neilb@suse.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Rspamd-Queue-Id: A9A1821108 X-Spam-Score: -3.01 X-Rspamd-Action: no action X-Spamd-Result: default: False [-3.01 / 50.00]; BAYES_HAM(-3.00)[100.00%]; MID_CONTAINS_FROM(1.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_MISSING_CHARSET(0.50)[]; NEURAL_HAM_SHORT(-0.20)[-1.000]; R_DKIM_ALLOW(-0.20)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; MIME_GOOD(-0.10)[text/plain]; MX_GOOD(-0.01)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_TRACE(0.00)[0:+]; ARC_NA(0.00)[]; TO_DN_SOME(0.00)[]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; RBL_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:104:10:150:64:97:from]; FUZZY_BLOCKED(0.00)[rspamd.com]; RCVD_TLS_ALL(0.00)[]; DKIM_TRACE(0.00)[suse.de:+]; RCVD_COUNT_TWO(0.00)[2]; FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[]; SPAMHAUS_XBL(0.00)[2a07:de40:b281:104:10:150:64:97:from]; RCPT_COUNT_SEVEN(0.00)[8]; RCVD_VIA_SMTP_AUTH(0.00)[]; RECEIVED_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:106:10:150:64:167:received]; R_RATELIMIT(0.00)[from(RLewrxuus8mos16izbn)]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:dkim,suse.de:mid,suse.de:email,imap1.dmz-prg2.suse.org:rdns,imap1.dmz-prg2.suse.org:helo] X-Rspamd-Server: rspamd1.dmz-prg2.suse.org X-Spam-Flag: NO X-Spam-Level: mkdir_async allows a different dentry to be returned which is sometimes relevant for nfs. This patch changes the nfs_rpc_ops mkdir op to return a dentry, and passes that back to the caller using mkdir_async. 
Signed-off-by: NeilBrown --- fs/nfs/dir.c | 17 ++++++++-------- fs/nfs/internal.h | 4 ++-- fs/nfs/nfs3proc.c | 9 +++++---- fs/nfs/nfs4proc.c | 45 +++++++++++++++++++++++++++++------------ fs/nfs/proc.c | 14 ++++++++----- include/linux/nfs_xdr.h | 2 +- 6 files changed, 58 insertions(+), 33 deletions(-) diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index 8cbe63f4089a..2c69ec77d02c 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -2420,11 +2420,12 @@ EXPORT_SYMBOL_GPL(nfs_mknod); /* * See comments for nfs_proc_create regarding failed operations. */ -int nfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, - struct dentry *dentry, umode_t mode) +struct dentry *nfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode, + struct dirop_ret *dret) { struct iattr attr; - int error; + struct dentry *ret; dfprintk(VFS, "NFS: mkdir(%s/%lu), %pd\n", dir->i_sb->s_id, dir->i_ino, dentry); @@ -2433,14 +2434,14 @@ int nfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, attr.ia_mode = mode | S_IFDIR; trace_nfs_mkdir_enter(dir, dentry); - error = NFS_PROTO(dir)->mkdir(dir, dentry, &attr); - trace_nfs_mkdir_exit(dir, dentry, error); - if (error != 0) + ret = NFS_PROTO(dir)->mkdir(dir, dentry, &attr); + trace_nfs_mkdir_exit(dir, dentry, PTR_ERR_OR_ZERO(ret)); + if (IS_ERR(ret)) goto out_err; - return 0; + return ret; out_err: d_drop(dentry); - return error; + return ret; } EXPORT_SYMBOL_GPL(nfs_mkdir); diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h index fae2c7ae4acc..f7dea7fe5ebc 100644 --- a/fs/nfs/internal.h +++ b/fs/nfs/internal.h @@ -400,8 +400,8 @@ struct dentry *nfs_lookup(struct inode *, struct dentry *, unsigned int); void nfs_d_prune_case_insensitive_aliases(struct inode *inode); int nfs_create(struct mnt_idmap *, struct inode *, struct dentry *, umode_t, bool); -int nfs_mkdir(struct mnt_idmap *, struct inode *, struct dentry *, - umode_t); +struct dentry *nfs_mkdir(struct mnt_idmap *, struct inode *, struct dentry *, + umode_t, struct 
dirop_ret *); int nfs_rmdir(struct inode *, struct dentry *); int nfs_unlink(struct inode *, struct dentry *); int nfs_symlink(struct mnt_idmap *, struct inode *, struct dentry *, diff --git a/fs/nfs/nfs3proc.c b/fs/nfs/nfs3proc.c index 0c3bc98cd999..41797cbbb8dc 100644 --- a/fs/nfs/nfs3proc.c +++ b/fs/nfs/nfs3proc.c @@ -578,7 +578,7 @@ nfs3_proc_symlink(struct inode *dir, struct dentry *dentry, struct folio *folio, return status; } -static int +static struct dentry * nfs3_proc_mkdir(struct inode *dir, struct dentry *dentry, struct iattr *sattr) { struct posix_acl *default_acl, *acl; @@ -613,14 +613,15 @@ nfs3_proc_mkdir(struct inode *dir, struct dentry *dentry, struct iattr *sattr) status = nfs3_proc_setacls(d_inode(dentry), acl, default_acl); - dput(d_alias); out_release_acls: posix_acl_release(acl); posix_acl_release(default_acl); out: nfs3_free_createdata(data); dprintk("NFS reply mkdir: %d\n", status); - return status; + if (status) + return ERR_PTR(status); + return d_alias; } static int @@ -1037,7 +1038,7 @@ static const struct inode_operations nfs3_dir_inode_operations = { .link = nfs_link, .unlink = nfs_unlink, .symlink = nfs_symlink, - .mkdir = nfs_mkdir, + .mkdir_async = nfs_mkdir, .rmdir = nfs_rmdir, .mknod = nfs_mknod, .rename = nfs_rename, diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index df9669d4ded7..ef219968ed22 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -5135,9 +5135,6 @@ static int nfs4_do_create(struct inode *dir, struct dentry *dentry, struct nfs4_ &data->arg.seq_args, &data->res.seq_res, 1); if (status == 0) { spin_lock(&dir->i_lock); - /* Creating a directory bumps nlink in the parent */ - if (data->arg.ftype == NF4DIR) - nfs4_inc_nlink_locked(dir); nfs4_update_changeattr_locked(dir, &data->res.dir_cinfo, data->res.fattr->time_start, NFS_INO_INVALID_DATA); @@ -5147,6 +5144,25 @@ static int nfs4_do_create(struct inode *dir, struct dentry *dentry, struct nfs4_ return status; } +static struct dentry *nfs4_do_mkdir(struct 
inode *dir, struct dentry *dentry, + struct nfs4_createdata *data) +{ + int status = nfs4_call_sync(NFS_SERVER(dir)->client, NFS_SERVER(dir), &data->msg, + &data->arg.seq_args, &data->res.seq_res, 1); + + if (status) + return ERR_PTR(status); + + spin_lock(&dir->i_lock); + /* Creating a directory bumps nlink in the parent */ + nfs4_inc_nlink_locked(dir); + nfs4_update_changeattr_locked(dir, &data->res.dir_cinfo, + data->res.fattr->time_start, + NFS_INO_INVALID_DATA); + spin_unlock(&dir->i_lock); + return nfs_add_or_obtain(dentry, data->res.fh, data->res.fattr); +} + static void nfs4_free_createdata(struct nfs4_createdata *data) { nfs4_label_free(data->fattr.label); @@ -5203,32 +5219,34 @@ static int nfs4_proc_symlink(struct inode *dir, struct dentry *dentry, return err; } -static int _nfs4_proc_mkdir(struct inode *dir, struct dentry *dentry, - struct iattr *sattr, struct nfs4_label *label) +static struct dentry *_nfs4_proc_mkdir(struct inode *dir, struct dentry *dentry, + struct iattr *sattr, + struct nfs4_label *label) { struct nfs4_createdata *data; - int status = -ENOMEM; + struct dentry *ret = ERR_PTR(-ENOMEM); data = nfs4_alloc_createdata(dir, &dentry->d_name, sattr, NF4DIR); if (data == NULL) goto out; data->arg.label = label; - status = nfs4_do_create(dir, dentry, data); + ret = nfs4_do_mkdir(dir, dentry, data); nfs4_free_createdata(data); out: - return status; + return ret; } -static int nfs4_proc_mkdir(struct inode *dir, struct dentry *dentry, - struct iattr *sattr) +static struct dentry *nfs4_proc_mkdir(struct inode *dir, struct dentry *dentry, + struct iattr *sattr) { struct nfs_server *server = NFS_SERVER(dir); struct nfs4_exception exception = { .interruptible = true, }; struct nfs4_label l, *label; + struct dentry *alias; int err; label = nfs4_label_init_security(dir, dentry, sattr, &l); @@ -5236,14 +5254,15 @@ static int nfs4_proc_mkdir(struct inode *dir, struct dentry *dentry, if (!(server->attr_bitmask[2] & FATTR4_WORD2_MODE_UMASK)) sattr->ia_mode 
&= ~current_umask(); do { - err = _nfs4_proc_mkdir(dir, dentry, sattr, label); + alias = _nfs4_proc_mkdir(dir, dentry, sattr, label); + err = PTR_ERR_OR_ZERO(alias); trace_nfs4_mkdir(dir, &dentry->d_name, err); err = nfs4_handle_exception(NFS_SERVER(dir), err, &exception); } while (exception.retry); nfs4_label_release_security(label); - return err; + return alias; } static int _nfs4_proc_readdir(struct nfs_readdir_arg *nr_arg, @@ -10865,7 +10884,7 @@ static const struct inode_operations nfs4_dir_inode_operations = { .link = nfs_link, .unlink = nfs_unlink, .symlink = nfs_symlink, - .mkdir = nfs_mkdir, + .mkdir_async = nfs_mkdir, .rmdir = nfs_rmdir, .mknod = nfs_mknod, .rename = nfs_rename, diff --git a/fs/nfs/proc.c b/fs/nfs/proc.c index 77920a2e3cef..7e8f6d8f02b4 100644 --- a/fs/nfs/proc.c +++ b/fs/nfs/proc.c @@ -446,13 +446,14 @@ nfs_proc_symlink(struct inode *dir, struct dentry *dentry, struct folio *folio, return status; } -static int +static struct dentry * nfs_proc_mkdir(struct inode *dir, struct dentry *dentry, struct iattr *sattr) { struct nfs_createdata *data; struct rpc_message msg = { .rpc_proc = &nfs_procedures[NFSPROC_MKDIR], }; + struct dentry *alias = NULL; int status = -ENOMEM; dprintk("NFS call mkdir %pd\n", dentry); @@ -464,12 +465,15 @@ nfs_proc_mkdir(struct inode *dir, struct dentry *dentry, struct iattr *sattr) status = rpc_call_sync(NFS_CLIENT(dir), &msg, 0); nfs_mark_for_revalidate(dir); - if (status == 0) - status = nfs_instantiate(dentry, data->res.fh, data->res.fattr); + if (status == 0) { + alias = nfs_add_or_obtain(dentry, data->res.fh, data->res.fattr); + status = PTR_ERR_OR_ZERO(alias); + } else + alias = ERR_PTR(status); nfs_free_createdata(data); out: dprintk("NFS reply mkdir: %d\n", status); - return status; + return alias; } static int @@ -706,7 +710,7 @@ static const struct inode_operations nfs_dir_inode_operations = { .link = nfs_link, .unlink = nfs_unlink, .symlink = nfs_symlink, - .mkdir = nfs_mkdir, + .mkdir_async = nfs_mkdir, 
.rmdir = nfs_rmdir, .mknod = nfs_mknod, .rename = nfs_rename, diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h index d0473e0d4aba..33d7f4c8183e 100644 --- a/include/linux/nfs_xdr.h +++ b/include/linux/nfs_xdr.h @@ -1801,7 +1801,7 @@ struct nfs_rpc_ops { int (*link) (struct inode *, struct inode *, const struct qstr *); int (*symlink) (struct inode *, struct dentry *, struct folio *, unsigned int, struct iattr *); - int (*mkdir) (struct inode *, struct dentry *, struct iattr *); + struct dentry *(*mkdir) (struct inode *, struct dentry *, struct iattr *); int (*rmdir) (struct inode *, const struct qstr *); int (*readdir) (struct nfs_readdir_arg *, struct nfs_readdir_res *); int (*mknod) (struct inode *, struct dentry *, struct iattr *, From patchwork Thu Feb 6 05:42:56 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 13962259 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.223.130]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 16ED1226162; Thu, 6 Feb 2025 05:48:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.130 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738820906; cv=none; b=Qpb/4mdiE8xh3AUdRJUYKil31wQEya9qp8nIUsyeKV2f0lVrIy6zY2MLz8dnHFJo2o4X6JlkSpxCt0+cWIRvWfRc9y+fimDTaQvR3IFrdp3HOSqkfYORgmgJwlVdttfzeJ6DRNnBu9FoP4gA7CeHNp5zzSWr9giD1zzSnCqqhUA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738820906; c=relaxed/simple; bh=dJB9W38a5aAx31v28nz34+/bjN6Z1cb6nXVPJ83bCf8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=IMpZUXLiOkjRGxUbyE7ThHz8vvFdqUBymTadByTh9evOmJNWZPaQG9PsMC8uoFed512xmfPLHUy6Hq+UtqRWzfuO/BVRy7JDJH/ZbVpH+3husHcu8Ir79IijN7qV9vUIlFtDYVKfBFx8We1xMMB4llJ9Vs2yZKdenfs4DbiF6ZY= 
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de; spf=pass smtp.mailfrom=suse.de; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=J2KRFxQM; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=zahueyYR; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=J2KRFxQM; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=zahueyYR; arc=none smtp.client-ip=195.135.223.130 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="J2KRFxQM"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="zahueyYR"; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="J2KRFxQM"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="zahueyYR" Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 51B6521108; Thu, 6 Feb 2025 05:48:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1738820902; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=FyPZVjbuo93GcMVwtHE8Q8fOTDh5z3A5/51maHi8lE8=; b=J2KRFxQMhTZxfH6OkHVz6gcDSEI+m75aOLA6BZBp09TQUt5Zu/zUYH2mTq9ejJEfZko7WB 51h2RMeGkrIfJHN4runP34JozXUaD9kgfRWENYjybs9bST09CY+IYnW/dEXVdZHTWyvWXj Jt2BQuC6EvgMFMXxNB21jaHJVLngq4Y= DKIM-Signature: v=1; a=ed25519-sha256; 
c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1738820902; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=FyPZVjbuo93GcMVwtHE8Q8fOTDh5z3A5/51maHi8lE8=; b=zahueyYR+ke83C3+XhLUIpKggIU1Brf8XGBHh/pQQkmsvpKwjut4AqX5lCXoys8NHgyp6r /I0qpRVwLw0P7ZAQ== Authentication-Results: smtp-out1.suse.de; dkim=pass header.d=suse.de header.s=susede2_rsa header.b=J2KRFxQM; dkim=pass header.d=suse.de header.s=susede2_ed25519 header.b=zahueyYR DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1738820902; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=FyPZVjbuo93GcMVwtHE8Q8fOTDh5z3A5/51maHi8lE8=; b=J2KRFxQMhTZxfH6OkHVz6gcDSEI+m75aOLA6BZBp09TQUt5Zu/zUYH2mTq9ejJEfZko7WB 51h2RMeGkrIfJHN4runP34JozXUaD9kgfRWENYjybs9bST09CY+IYnW/dEXVdZHTWyvWXj Jt2BQuC6EvgMFMXxNB21jaHJVLngq4Y= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1738820902; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=FyPZVjbuo93GcMVwtHE8Q8fOTDh5z3A5/51maHi8lE8=; b=zahueyYR+ke83C3+XhLUIpKggIU1Brf8XGBHh/pQQkmsvpKwjut4AqX5lCXoys8NHgyp6r /I0qpRVwLw0P7ZAQ== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 42FBC13795; Thu, 6 Feb 2025 05:48:18 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id 
MlvTOSJNpGcUCAAAD6G6ig (envelope-from ); Thu, 06 Feb 2025 05:48:18 +0000 From: NeilBrown To: Alexander Viro , Christian Brauner , Jan Kara , Linus Torvalds , Jeff Layton , Dave Chinner Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 19/19] nfs: switch to _async for all directory ops. Date: Thu, 6 Feb 2025 16:42:56 +1100 Message-ID: <20250206054504.2950516-20-neilb@suse.de> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250206054504.2950516-1-neilb@suse.de> References: <20250206054504.2950516-1-neilb@suse.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Rspamd-Queue-Id: 51B6521108 X-Spam-Score: -3.01 X-Rspamd-Action: no action X-Spamd-Result: default: False [-3.01 / 50.00]; BAYES_HAM(-3.00)[100.00%]; MID_CONTAINS_FROM(1.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_MISSING_CHARSET(0.50)[]; NEURAL_HAM_SHORT(-0.20)[-1.000]; R_DKIM_ALLOW(-0.20)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; MIME_GOOD(-0.10)[text/plain]; MX_GOOD(-0.01)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_TRACE(0.00)[0:+]; ARC_NA(0.00)[]; TO_DN_SOME(0.00)[]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; RBL_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:104:10:150:64:97:from]; FUZZY_BLOCKED(0.00)[rspamd.com]; RCVD_TLS_ALL(0.00)[]; DKIM_TRACE(0.00)[suse.de:+]; RCVD_COUNT_TWO(0.00)[2]; FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[]; SPAMHAUS_XBL(0.00)[2a07:de40:b281:104:10:150:64:97:from]; RCPT_COUNT_SEVEN(0.00)[8]; RCVD_VIA_SMTP_AUTH(0.00)[]; RECEIVED_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:106:10:150:64:167:received]; R_RATELIMIT(0.00)[from(RLewrxuus8mos16izbn)]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:dkim,suse.de:mid,suse.de:email,imap1.dmz-prg2.suse.org:rdns,imap1.dmz-prg2.suse.org:helo] X-Rspamd-Server: rspamd1.dmz-prg2.suse.org X-Spam-Flag: NO X-Spam-Level: nfs doesn't benefit from exclusive locking by the VFS as all directory ops are sent to the server 
which does any needed locking. The interesting part is "silly-rename" which needs to create and lock another dentry while an unlink or rename is happening. nfs_sillyrename() now returns that locked dentry and nfs_sillyrename_finish() is added to unlock it when appropriate. In order to keep all dentries locked until the operation completes, nfs_sillyrename() now uses d_exchange() to record the silly rename in the dcache. d_exchange() therefore has to be exported and permitted to work on a negative second dentry. Signed-off-by: NeilBrown --- fs/dcache.c | 5 +++- fs/nfs/dir.c | 55 ++++++++++++++++++++++++------------------ fs/nfs/internal.h | 20 +++++++++------ fs/nfs/nfs3proc.c | 16 ++++++------ fs/nfs/nfs4_fs.h | 2 +- fs/nfs/nfs4proc.c | 16 ++++++------ fs/nfs/proc.c | 16 ++++++------ fs/nfs/unlink.c | 48 +++++++++++++++++++++++++----------- include/linux/namei.h | 1 - include/linux/nfs_fs.h | 3 --- 10 files changed, 106 insertions(+), 76 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c index 90dee859d138..203d71eb4789 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -2981,7 +2981,9 @@ void d_exchange(struct dentry *dentry1, struct dentry *dentry2) write_seqlock(&rename_lock); WARN_ON(!dentry1->d_inode); - WARN_ON(!dentry2->d_inode); + /* allow dentry2 to be negative so we can do a rename but keep + * both names locked with DCACHE_PAR_UPDATE. + */ WARN_ON(IS_ROOT(dentry1)); WARN_ON(IS_ROOT(dentry2)); @@ -2989,6 +2991,7 @@ void d_exchange(struct dentry *dentry1, struct dentry *dentry2) write_sequnlock(&rename_lock); } +EXPORT_SYMBOL(d_exchange); /** * d_ancestor - search for an ancestor diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index 2c69ec77d02c..c0116d44a6fc 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -1956,10 +1956,14 @@ struct dentry *nfs_lookup(struct inode *dir, struct dentry * dentry, unsigned in return ERR_PTR(-ENAMETOOLONG); /* - * If we're doing an exclusive create, optimize away the lookup - * but don't hash the dentry.
+	 * If we're doing an exclusive create, or if this is the target
+	 * of a rename, optimize away the lookup but don't hash the dentry.
+	 * A silly_rename is uniquely marked exclusive (REALLY? FIXME) and a rename target,
+	 * and it requests an explicit lookup.
 	 */
-	if (nfs_is_exclusive_create(dir, flags) || flags & LOOKUP_RENAME_TARGET)
+	if (nfs_is_exclusive_create(dir, flags) || (flags & LOOKUP_RENAME_TARGET &&
+	    ((flags & (LOOKUP_EXCL | LOOKUP_RENAME_TARGET)) !=
+	     (LOOKUP_EXCL | LOOKUP_RENAME_TARGET))))
 		return NULL;

 	res = ERR_PTR(-ENOMEM);
@@ -2057,7 +2061,7 @@ static int nfs_finish_open(struct nfs_open_context *ctx,

 int nfs_atomic_open(struct inode *dir, struct dentry *dentry,
 		    struct file *file, unsigned open_flags,
-		    umode_t mode)
+		    umode_t mode, struct dirop_ret *ret)
 {
 	struct nfs_open_context *ctx;
 	struct dentry *res;
@@ -2256,7 +2260,7 @@ nfs4_lookup_revalidate(struct inode *dir, const struct qstr *name,

 int nfs_atomic_open_v23(struct inode *dir, struct dentry *dentry,
 			struct file *file, unsigned int open_flags,
-			umode_t mode)
+			umode_t mode, struct dirop_ret *ret)
 {

 	/* Same as look+open from lookup_open(), but with different O_TRUNC
@@ -2383,7 +2387,8 @@ static int nfs_do_create(struct inode *dir, struct dentry *dentry,
 }

 int nfs_create(struct mnt_idmap *idmap, struct inode *dir,
-	       struct dentry *dentry, umode_t mode, bool excl)
+	       struct dentry *dentry, umode_t mode, bool excl,
+	       struct dirop_ret *ret)
 {
 	return nfs_do_create(dir, dentry, mode, excl ? O_EXCL : 0);
 }
@@ -2394,7 +2399,8 @@ EXPORT_SYMBOL_GPL(nfs_create);
  */
 int
 nfs_mknod(struct mnt_idmap *idmap, struct inode *dir,
-	  struct dentry *dentry, umode_t mode, dev_t rdev)
+	  struct dentry *dentry, umode_t mode, dev_t rdev,
+	  struct dirop_ret *ret)
 {
 	struct iattr attr;
 	int status;
@@ -2466,7 +2472,7 @@ static void nfs_dentry_remove_handle_error(struct inode *dir,
 	}
 }

-int nfs_rmdir(struct inode *dir, struct dentry *dentry)
+int nfs_rmdir(struct inode *dir, struct dentry *dentry, struct dirop_ret *ret)
 {
 	int error;

@@ -2535,7 +2541,7 @@ static int nfs_safe_remove(struct dentry *dentry)
  *
  *  If sillyrename() returns 0, we do nothing, otherwise we unlink.
  */
-int nfs_unlink(struct inode *dir, struct dentry *dentry)
+int nfs_unlink(struct inode *dir, struct dentry *dentry, struct dirop_ret *ret)
 {
 	int error;

@@ -2546,10 +2552,14 @@ int nfs_unlink(struct inode *dir, struct dentry *dentry)
 	spin_lock(&dentry->d_lock);
 	if (d_count(dentry) > 1 && !test_bit(NFS_INO_PRESERVE_UNLINKED,
 					     &NFS_I(d_inode(dentry))->flags)) {
+		struct dentry *silly;
+
 		spin_unlock(&dentry->d_lock);
 		/* Start asynchronous writeout of the inode */
 		write_inode_now(d_inode(dentry), 0);
-		error = nfs_sillyrename(dir, dentry);
+		silly = nfs_sillyrename(dir, dentry);
+		nfs_sillyrename_finish(silly);
+		error = PTR_ERR_OR_ZERO(silly);
 		goto out;
 	}
 	/* We must prevent any concurrent open until the unlink
@@ -2591,7 +2601,7 @@ EXPORT_SYMBOL_GPL(nfs_unlink);
  * and move the raw page into its mapping.
  */
 int nfs_symlink(struct mnt_idmap *idmap, struct inode *dir,
-		struct dentry *dentry, const char *symname)
+		struct dentry *dentry, const char *symname, struct dirop_ret *ret)
 {
 	struct folio *folio;
 	char *kaddr;
@@ -2647,7 +2657,8 @@ int nfs_symlink(struct mnt_idmap *idmap, struct inode *dir,
 EXPORT_SYMBOL_GPL(nfs_symlink);

 int
-nfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *dentry)
+nfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *dentry,
+	 struct dirop_ret *ret)
 {
 	struct inode *inode = d_inode(old_dentry);
 	int error;
@@ -2688,7 +2699,7 @@ nfs_unblock_rename(struct rpc_task *task, struct nfs_renamedata *data)
  * file in old_dir will go away when the last process iput()s the inode.
  *
  * FIXED.
- * 
+ *
  * It actually works quite well. One needs to have the possibility for
  * at least one ".nfs..." file in each directory the file ever gets
  * moved or linked to which happens automagically with the new
@@ -2704,7 +2715,8 @@ nfs_unblock_rename(struct rpc_task *task, struct nfs_renamedata *data)
  */
 int nfs_rename(struct mnt_idmap *idmap, struct inode *old_dir,
 	       struct dentry *old_dentry, struct inode *new_dir,
-	       struct dentry *new_dentry, unsigned int flags)
+	       struct dentry *new_dentry, unsigned int flags,
+	       struct dirop_ret *ret)
 {
 	struct inode *old_inode = d_inode(old_dentry);
 	struct inode *new_inode = d_inode(new_dentry);
@@ -2744,16 +2756,12 @@ int nfs_rename(struct mnt_idmap *idmap, struct inode *old_dir,

 			spin_unlock(&new_dentry->d_lock);

-			/* copy the target dentry's name */
-			dentry = d_alloc(new_dentry->d_parent,
-					 &new_dentry->d_name);
-			if (!dentry)
-				goto out;
-
 			/* silly-rename the existing target ... */
-			err = nfs_sillyrename(new_dir, new_dentry);
-			if (err)
+			dentry = nfs_sillyrename(new_dir, new_dentry);
+			if (IS_ERR(dentry)) {
+				err = PTR_ERR(dentry);
 				goto out;
+			}

 			new_dentry = dentry;
 			new_inode = NULL;
@@ -2811,9 +2819,8 @@ int nfs_rename(struct mnt_idmap *idmap, struct inode *old_dir,
 	} else if (error == -ENOENT)
 		nfs_dentry_handle_enoent(old_dentry);

-	/* new dentry created? */
 	if (dentry)
-		dput(dentry);
+		nfs_sillyrename_finish(dentry);
 	return error;
 }
 EXPORT_SYMBOL_GPL(nfs_rename);
diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index f7dea7fe5ebc..ba00ffeb70ac 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -399,18 +399,21 @@ extern unsigned long nfs_access_cache_scan(struct shrinker *shrink,
 struct dentry *nfs_lookup(struct inode *, struct dentry *, unsigned int);
 void nfs_d_prune_case_insensitive_aliases(struct inode *inode);
 int nfs_create(struct mnt_idmap *, struct inode *, struct dentry *,
-	       umode_t, bool);
+	       umode_t, bool, struct dirop_ret *);
 struct dentry *nfs_mkdir(struct mnt_idmap *, struct inode *, struct dentry *,
 			 umode_t, struct dirop_ret *);
-int nfs_rmdir(struct inode *, struct dentry *);
-int nfs_unlink(struct inode *, struct dentry *);
+int nfs_rmdir(struct inode *, struct dentry *, struct dirop_ret *);
+int nfs_unlink(struct inode *, struct dentry *, struct dirop_ret *);
 int nfs_symlink(struct mnt_idmap *, struct inode *, struct dentry *,
-		const char *);
-int nfs_link(struct dentry *, struct inode *, struct dentry *);
+		const char *, struct dirop_ret *);
+int nfs_link(struct dentry *, struct inode *, struct dentry *, struct dirop_ret *);
 int nfs_mknod(struct mnt_idmap *, struct inode *, struct dentry *, umode_t,
-	      dev_t);
+	      dev_t, struct dirop_ret *);
 int nfs_rename(struct mnt_idmap *, struct inode *, struct dentry *,
-	       struct inode *, struct dentry *, unsigned int);
+	       struct inode *, struct dentry *, unsigned int, struct dirop_ret *);
+int nfs_atomic_open_v23(struct inode *dir, struct dentry *dentry,
+			struct file *file, unsigned int open_flags,
+			umode_t mode, struct dirop_ret *);

 #ifdef CONFIG_NFS_V4_2
 static inline __u32 nfs_access_xattr_mask(const struct nfs_server *server)
@@ -707,7 +710,8 @@ extern struct rpc_task *
 nfs_async_rename(struct inode *old_dir, struct inode *new_dir,
 		 struct dentry *old_dentry, struct dentry *new_dentry,
 		 void (*complete)(struct rpc_task *, struct nfs_renamedata *));
-extern int nfs_sillyrename(struct inode *dir, struct dentry *dentry);
+extern struct dentry *nfs_sillyrename(struct inode *dir, struct dentry *dentry);
+extern void nfs_sillyrename_finish(struct dentry *dentry);

 /* direct.c */
 void nfs_init_cinfo_from_dreq(struct nfs_commit_info *cinfo,
diff --git a/fs/nfs/nfs3proc.c b/fs/nfs/nfs3proc.c
index 41797cbbb8dc..833e679d0a2b 100644
--- a/fs/nfs/nfs3proc.c
+++ b/fs/nfs/nfs3proc.c
@@ -1032,16 +1032,16 @@ static int nfs3_return_delegation(struct inode *inode)
 }

 static const struct inode_operations nfs3_dir_inode_operations = {
-	.create		= nfs_create,
-	.atomic_open	= nfs_atomic_open_v23,
+	.create_async	= nfs_create,
+	.atomic_open_async = nfs_atomic_open_v23,
 	.lookup		= nfs_lookup,
-	.link		= nfs_link,
-	.unlink		= nfs_unlink,
-	.symlink	= nfs_symlink,
+	.link_async	= nfs_link,
+	.unlink_async	= nfs_unlink,
+	.symlink_async	= nfs_symlink,
 	.mkdir_async	= nfs_mkdir,
-	.rmdir		= nfs_rmdir,
-	.mknod		= nfs_mknod,
-	.rename		= nfs_rename,
+	.rmdir_async	= nfs_rmdir,
+	.mknod_async	= nfs_mknod,
+	.rename_async	= nfs_rename,
 	.permission	= nfs_permission,
 	.getattr	= nfs_getattr,
 	.setattr	= nfs_setattr,
diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
index 7d383d29a995..65fbcef5830e 100644
--- a/fs/nfs/nfs4_fs.h
+++ b/fs/nfs/nfs4_fs.h
@@ -273,7 +273,7 @@ extern const struct dentry_operations nfs4_dentry_operations;

 /* dir.c */
 int nfs_atomic_open(struct inode *, struct dentry *, struct file *,
-		    unsigned, umode_t);
+		    unsigned, umode_t, struct dirop_ret *);

 /* fs_context.c */
 extern struct file_system_type nfs4_fs_type;
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index ef219968ed22..4fd312838bd3 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -10878,16 +10878,16 @@ static void nfs4_disable_swap(struct inode *inode)
 }

 static const struct inode_operations nfs4_dir_inode_operations = {
-	.create		= nfs_create,
+	.create_async	= nfs_create,
 	.lookup		= nfs_lookup,
-	.atomic_open	= nfs_atomic_open,
-	.link		= nfs_link,
-	.unlink		= nfs_unlink,
-	.symlink	= nfs_symlink,
+	.atomic_open_async = nfs_atomic_open,
+	.link_async	= nfs_link,
+	.unlink_async	= nfs_unlink,
+	.symlink_async	= nfs_symlink,
 	.mkdir_async	= nfs_mkdir,
-	.rmdir		= nfs_rmdir,
-	.mknod		= nfs_mknod,
-	.rename		= nfs_rename,
+	.rmdir_async	= nfs_rmdir,
+	.mknod_async	= nfs_mknod,
+	.rename_async	= nfs_rename,
 	.permission	= nfs_permission,
 	.getattr	= nfs_getattr,
 	.setattr	= nfs_setattr,
diff --git a/fs/nfs/proc.c b/fs/nfs/proc.c
index 7e8f6d8f02b4..211edd9f5115 100644
--- a/fs/nfs/proc.c
+++ b/fs/nfs/proc.c
@@ -704,16 +704,16 @@ static int nfs_return_delegation(struct inode *inode)
 }

 static const struct inode_operations nfs_dir_inode_operations = {
-	.create		= nfs_create,
+	.create_async	= nfs_create,
 	.lookup		= nfs_lookup,
-	.atomic_open	= nfs_atomic_open_v23,
-	.link		= nfs_link,
-	.unlink		= nfs_unlink,
-	.symlink	= nfs_symlink,
+	.atomic_open_async = nfs_atomic_open_v23,
+	.link_async	= nfs_link,
+	.unlink_async	= nfs_unlink,
+	.symlink_async	= nfs_symlink,
 	.mkdir_async	= nfs_mkdir,
-	.rmdir		= nfs_rmdir,
-	.mknod		= nfs_mknod,
-	.rename		= nfs_rename,
+	.rmdir_async	= nfs_rmdir,
+	.mknod_async	= nfs_mknod,
+	.rename_async	= nfs_rename,
 	.permission	= nfs_permission,
 	.getattr	= nfs_getattr,
 	.setattr	= nfs_setattr,
diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c
index d44162d3a8f1..06b71ec9520c 100644
--- a/fs/nfs/unlink.c
+++ b/fs/nfs/unlink.c
@@ -430,6 +430,10 @@ nfs_complete_sillyrename(struct rpc_task *task, struct nfs_renamedata *data)
  *
  * The final cleanup is done during dentry_iput.
  *
+ * We exchange the original with the new (silly) dentries, and return
+ * the new dentry which will now have the original name.  This ensures that
+ * the target name remains locked until the rename completes.
+ *
  * (Note: NFSv4 is stateful, and has opens, so in theory an NFSv4 server
  * could take responsibility for keeping open files referenced.  The server
  * would also need to ensure that opened-but-deleted files were kept over
@@ -438,7 +442,7 @@ nfs_complete_sillyrename(struct rpc_task *task, struct nfs_renamedata *data)
  * use to advertise that it does this; some day we may take advantage of
  * it.))
  */
-int
+struct dentry *
 nfs_sillyrename(struct inode *dir, struct dentry *dentry)
 {
 	static unsigned int sillycounter;
@@ -447,7 +451,8 @@ nfs_sillyrename(struct inode *dir, struct dentry *dentry)
 	struct dentry *sdentry;
 	struct inode *inode = d_inode(dentry);
 	struct rpc_task *task;
-	int error = -EBUSY;
+	struct dentry *base;
+	int error = -EBUSY;

 	dfprintk(VFS, "NFS: silly-rename(%pd2, ct=%d)\n",
 		dentry, d_count(dentry));
@@ -461,10 +466,11 @@ nfs_sillyrename(struct inode *dir, struct dentry *dentry)

 	fileid = NFS_FILEID(d_inode(dentry));

+	base = d_find_alias(dir);
 	sdentry = NULL;
 	do {
 		int slen;
-		dput(sdentry);
+
 		sillycounter++;
 		slen = scnprintf(silly, sizeof(silly),
 				SILLYNAME_PREFIX "%0*llx%0*x",
@@ -474,14 +480,19 @@ nfs_sillyrename(struct inode *dir, struct dentry *dentry)
 		dfprintk(VFS, "NFS: trying to rename %pd to %s\n",
 				dentry, silly);

-		sdentry = lookup_one_len(silly, dentry->d_parent, slen);
-		/*
-		 * N.B. Better to return EBUSY here ... it could be
-		 * dangerous to delete the file while it's in use.
-		 */
-		if (IS_ERR(sdentry))
-			goto out;
-	} while (d_inode(sdentry) != NULL); /* need negative lookup */
+		sdentry = lookup_and_lock_one(NULL, silly, slen,
+					      base,
+					      LOOKUP_CREATE | LOOKUP_EXCL
+					      | LOOKUP_RENAME_TARGET
+					      | LOOKUP_PARENT_LOCKED);
+	} while (PTR_ERR_OR_ZERO(sdentry) == -EEXIST); /* need negative lookup */
+	dput(base);
+	/*
+	 * N.B. Better to return EBUSY here ... it could be
+	 * dangerous to delete the file while it's in use.
+	 */
+	if (IS_ERR(sdentry))
+		goto out;

 	ihold(inode);

@@ -515,7 +526,7 @@ nfs_sillyrename(struct inode *dir, struct dentry *dentry)
 				NFS_INO_INVALID_CTIME |
 				NFS_INO_REVAL_FORCED);
 		spin_unlock(&inode->i_lock);
-		d_move(dentry, sdentry);
+		d_exchange(dentry, sdentry);
 		break;
 	case -ERESTARTSYS:
 		/* The result of the rename is unknown. Play it safe by
@@ -526,7 +537,16 @@ nfs_sillyrename(struct inode *dir, struct dentry *dentry)
 	rpc_put_task(task);
 out_dput:
 	iput(inode);
-	dput(sdentry);
+	if (!error)
+		return dentry;
+	done_lookup_and_lock(NULL, sdentry, LOOKUP_PARENT_LOCKED);
+
 out:
-	return error;
+	return ERR_PTR(error);
+}
+
+void nfs_sillyrename_finish(struct dentry *dentry)
+{
+	if (!IS_ERR(dentry))
+		done_lookup_and_lock(NULL, dentry, LOOKUP_PARENT_LOCKED);
 }
diff --git a/include/linux/namei.h b/include/linux/namei.h
index 8ef7aa6ed64c..29903e2cdf97 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -95,7 +95,6 @@ struct dentry *__lookup_and_lock_one(struct mnt_idmap *idmap,
 				     unsigned int lookup_flags);
 void done_lookup_and_lock(struct dentry *base, struct dentry *dentry,
 			  unsigned int lookup_flags);
-void __done_lookup_and_lock(struct dentry *dentry);

 extern int follow_down_one(struct path *);
 extern int follow_down(struct path *path, unsigned int flags);
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 67ae2c3f41d2..6f9f4adfdf4c 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -579,9 +579,6 @@ extern int nfs_may_open(struct inode *inode, const struct cred *cred, int openfl
 extern void nfs_access_zap_cache(struct inode *inode);
 extern int nfs_access_get_cached(struct inode *inode, const struct cred *cred,
 				 u32 *mask, bool may_block);
-extern int nfs_atomic_open_v23(struct inode *dir, struct dentry *dentry,
-			       struct file *file, unsigned int open_flags,
-			       umode_t mode);

 /*
  * linux/fs/nfs/symlink.c
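
Note on the return convention used above: nfs_sillyrename() now returns
either a still-locked dentry or an errno encoded in the pointer, and
nfs_sillyrename_finish() is a no-op for the error case, so a caller such
as nfs_unlink() can call finish unconditionally and only then turn the
pointer into an int.  The userspace sketch below only illustrates that
calling pattern and is not kernel code.  ERR_PTR(), PTR_ERR(), IS_ERR()
and PTR_ERR_OR_ZERO() are simplified stand-ins for the helpers in
<linux/err.h>; struct fake_dentry and every sketch_*() function are
hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the <linux/err.h> helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline bool IS_ERR(const void *ptr)
{
	/* Error pointers live in the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}
static inline int PTR_ERR_OR_ZERO(const void *ptr)
{
	return IS_ERR(ptr) ? (int)PTR_ERR(ptr) : 0;
}

/* Hypothetical stand-in for a dentry that can be held locked. */
struct fake_dentry {
	int locked;
};

/*
 * Assumed analogue of nfs_sillyrename(): on success the object is
 * returned still locked, on failure an encoded errno is returned.
 */
static struct fake_dentry *sketch_sillyrename(struct fake_dentry *d, bool fail)
{
	if (fail)
		return ERR_PTR(-EBUSY);
	d->locked = 1;
	return d;
}

/* Assumed analogue of nfs_sillyrename_finish(): unlock only real pointers. */
static void sketch_sillyrename_finish(struct fake_dentry *d)
{
	if (!IS_ERR(d))
		d->locked = 0;
}

/* Caller shaped like the nfs_unlink() hunk: finish first, then convert. */
static int sketch_unlink(struct fake_dentry *d, bool fail)
{
	struct fake_dentry *silly = sketch_sillyrename(d, fail);

	sketch_sillyrename_finish(silly);
	return PTR_ERR_OR_ZERO(silly);
}

/* Small self-check exercising both the success and the failure path. */
static void sketch_selfcheck(void)
{
	struct fake_dentry d = { 0 };

	assert(sketch_unlink(&d, false) == 0);
	assert(d.locked == 0);
	assert(sketch_unlink(&d, true) == -EBUSY);
}
```

Calling sketch_selfcheck() from a test harness exercises both paths.
Because finish checks IS_ERR() itself, the caller does not need a
separate error branch before unlocking, which is the property the
nfs_unlink() change appears to rely on.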