From patchwork Mon Aug 12 06:44:17 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Al Viro
X-Patchwork-Id: 13760198
From: Al Viro
To: viro@zeniv.linux.org.uk
Cc: brauner@kernel.org, jack@suse.cz, linux-fsdevel@vger.kernel.org
Subject: [PATCH 01/11] get rid of ...lookup...fdget_rcu() family
Date: Mon, 12 Aug 2024 07:44:17 +0100
Message-ID: <20240812064427.240190-1-viro@zeniv.linux.org.uk>
X-Mailer: git-send-email 2.46.0
In-Reply-To: <20240812064214.GH13701@ZenIV>
References: <20240812064214.GH13701@ZenIV>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

Once upon a time, predecessors of those used to do file lookup
without bumping a refcount, provided that caller held rcu_read_lock()
across the lookup and whatever it wanted to read from the struct
file found.  When struct file allocation switched to SLAB_TYPESAFE_BY_RCU,
that stopped being feasible and these primitives started to bump the
file refcount for lookup result, requiring the caller to call fput()
afterwards.

But that turned them pointless - e.g.
	rcu_read_lock();
	file = lookup_fdget_rcu(fd);
	rcu_read_unlock();
is equivalent to
	file = fget_raw(fd);
and all callers of lookup_fdget_rcu() are of that form.  Similarly,
task_lookup_fdget_rcu() calls can be replaced with calling fget_task().
task_lookup_next_fdget_rcu() doesn't have direct counterparts, but
its callers would be happier if we replaced it with an analogue that
deals with RCU internally.

Signed-off-by: Al Viro
Reviewed-by: Christian Brauner
---
 arch/powerpc/platforms/cell/spufs/coredump.c |  4 +--
 fs/file.c                                    | 28 +++-----------------
 fs/gfs2/glock.c                              | 12 ++-------
 fs/notify/dnotify/dnotify.c                  |  5 +---
 fs/proc/fd.c                                 | 12 +++------
 include/linux/fdtable.h                      |  4 ---
 include/linux/file.h                         |  1 +
 kernel/bpf/task_iter.c                       |  6 +----
 kernel/kcmp.c                                |  4 +--
 9 files changed, 14 insertions(+), 62 deletions(-)

diff --git a/arch/powerpc/platforms/cell/spufs/coredump.c b/arch/powerpc/platforms/cell/spufs/coredump.c
index 18daafbe2e65..301ee7d8b7df 100644
--- a/arch/powerpc/platforms/cell/spufs/coredump.c
+++ b/arch/powerpc/platforms/cell/spufs/coredump.c
@@ -73,9 +73,7 @@ static struct spu_context *coredump_next_context(int *fd)
 		return NULL;
 	*fd = n - 1;
 
-	rcu_read_lock();
-	file = lookup_fdget_rcu(*fd);
-	rcu_read_unlock();
+	file = fget_raw(*fd);
 	if (file) {
 		ctx = SPUFS_I(file_inode(file))->i_ctx;
 		get_spu_context(ctx);
diff --git a/fs/file.c b/fs/file.c
index 655338effe9c..ac9e04e97e4b 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -1064,29 +1064,7 @@ struct file *fget_task(struct task_struct *task, unsigned int fd)
 	return file;
 }
 
-struct file *lookup_fdget_rcu(unsigned int fd)
-{
-	return __fget_files_rcu(current->files, fd, 0);
-
-}
-EXPORT_SYMBOL_GPL(lookup_fdget_rcu);
-
-struct file *task_lookup_fdget_rcu(struct task_struct *task, unsigned int fd)
-{
-	/* Must be called with rcu_read_lock held */
-	struct files_struct *files;
-	struct file *file = NULL;
-
-	task_lock(task);
-	files = task->files;
-	if (files)
-		file = __fget_files_rcu(files, fd, 0);
-	task_unlock(task);
-
-	return file;
-}
-
-struct file *task_lookup_next_fdget_rcu(struct task_struct *task, unsigned int *ret_fd)
+struct file *fget_task_next(struct task_struct *task, unsigned int *ret_fd)
 {
 	/* Must be called with rcu_read_lock held */
 	struct files_struct *files;
@@ -1096,17 +1074,19 @@ struct file *task_lookup_next_fdget_rcu(struct task_struct *task, unsigned int *
 	task_lock(task);
 	files = task->files;
 	if (files) {
+		rcu_read_lock();
 		for (; fd < files_fdtable(files)->max_fds; fd++) {
 			file = __fget_files_rcu(files, fd, 0);
 			if (file)
 				break;
 		}
+		rcu_read_unlock();
 	}
 	task_unlock(task);
 	*ret_fd = fd;
 	return file;
 }
-EXPORT_SYMBOL(task_lookup_next_fdget_rcu);
+EXPORT_SYMBOL(fget_task_next);
 
 /*
  * Lightweight file lookup - no refcnt increment if fd table isn't shared.
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 12a769077ea0..a4f5940c3e0a 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -34,7 +34,6 @@
 #include
 #include
 #include
-#include
 #include
 
 #include "gfs2.h"
@@ -2765,25 +2764,18 @@ static struct file *gfs2_glockfd_next_file(struct gfs2_glockfd_iter *i)
 		i->file = NULL;
 	}
 
-	rcu_read_lock();
 	for(;; i->fd++) {
-		struct inode *inode;
-
-		i->file = task_lookup_next_fdget_rcu(i->task, &i->fd);
+		i->file = fget_task_next(i->task, &i->fd);
 		if (!i->file) {
 			i->fd = 0;
 			break;
 		}
 
-		inode = file_inode(i->file);
-		if (inode->i_sb == i->sb)
+		if (file_inode(i->file)->i_sb == i->sb)
 			break;
 
-		rcu_read_unlock();
 		fput(i->file);
-		rcu_read_lock();
 	}
-	rcu_read_unlock();
 	return i->file;
 }
 
diff --git a/fs/notify/dnotify/dnotify.c b/fs/notify/dnotify/dnotify.c
index f3669403fabf..65521c01d2a4 100644
--- a/fs/notify/dnotify/dnotify.c
+++ b/fs/notify/dnotify/dnotify.c
@@ -16,7 +16,6 @@
 #include
 #include
 #include
-#include
 #include
 
 static int dir_notify_enable __read_mostly = 1;
@@ -343,9 +342,7 @@ int fcntl_dirnotify(int fd, struct file *filp, unsigned int arg)
 		new_fsn_mark = NULL;
 	}
 
-	rcu_read_lock();
-	f = lookup_fdget_rcu(fd);
-	rcu_read_unlock();
+	f = fget_raw(fd);
 
 	/* if (f != filp) means that we lost a race and another task/thread
 	 * actually closed the fd we are still playing with before we grabbed
diff --git a/fs/proc/fd.c b/fs/proc/fd.c
index 586bbc84ca04..077c51ba1ba7 100644
--- a/fs/proc/fd.c
+++ b/fs/proc/fd.c
@@ -116,9 +116,7 @@ static bool tid_fd_mode(struct task_struct *task, unsigned fd, fmode_t *mode)
 {
 	struct file *file;
 
-	rcu_read_lock();
-	file = task_lookup_fdget_rcu(task, fd);
-	rcu_read_unlock();
+	file = fget_task(task, fd);
 	if (file) {
 		*mode = file->f_mode;
 		fput(file);
@@ -258,19 +256,17 @@ static int proc_readfd_common(struct file *file, struct dir_context *ctx,
 	if (!dir_emit_dots(file, ctx))
 		goto out;
 
-	rcu_read_lock();
 	for (fd = ctx->pos - 2;; fd++) {
 		struct file *f;
 		struct fd_data data;
 		char name[10 + 1];
 		unsigned int len;
 
-		f = task_lookup_next_fdget_rcu(p, &fd);
+		f = fget_task_next(p, &fd);
 		ctx->pos = fd + 2LL;
 		if (!f)
 			break;
 		data.mode = f->f_mode;
-		rcu_read_unlock();
 		fput(f);
 		data.fd = fd;
 
@@ -278,11 +274,9 @@ static int proc_readfd_common(struct file *file, struct dir_context *ctx,
 		if (!proc_fill_cache(file, ctx,
 				     name, len, instantiate, p,
 				     &data))
-			goto out;
+			break;
 		cond_resched();
-		rcu_read_lock();
 	}
-	rcu_read_unlock();
 out:
 	put_task_struct(p);
 	return 0;
diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h
index 2944d4aa413b..b395a34eebf4 100644
--- a/include/linux/fdtable.h
+++ b/include/linux/fdtable.h
@@ -93,10 +93,6 @@ static inline struct file *files_lookup_fd_locked(struct files_struct *files, un
 	return files_lookup_fd_raw(files, fd);
 }
 
-struct file *lookup_fdget_rcu(unsigned int fd);
-struct file *task_lookup_fdget_rcu(struct task_struct *task, unsigned int fd);
-struct file *task_lookup_next_fdget_rcu(struct task_struct *task, unsigned int *fd);
-
 static inline bool close_on_exec(unsigned int fd, const struct files_struct *files)
 {
 	return test_bit(fd, files_fdtable(files)->close_on_exec);
diff --git a/include/linux/file.h b/include/linux/file.h
index 237931f20739..006005f621d1 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -51,6 +51,7 @@ static inline void fdput(struct fd fd)
 extern struct file *fget(unsigned int fd);
 extern struct file *fget_raw(unsigned int fd);
 extern struct file *fget_task(struct task_struct *task, unsigned int fd);
+extern struct file *fget_task_next(struct task_struct *task, unsigned int *fd);
 extern unsigned long __fdget(unsigned int fd);
 extern unsigned long __fdget_raw(unsigned int fd);
 extern unsigned long __fdget_pos(unsigned int fd);
diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c
index 02aa9db8d796..7fe602ca74a0 100644
--- a/kernel/bpf/task_iter.c
+++ b/kernel/bpf/task_iter.c
@@ -5,7 +5,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -286,17 +285,14 @@ task_file_seq_get_next(struct bpf_iter_seq_task_file_info *info)
 		curr_fd = 0;
 	}
 
-	rcu_read_lock();
-	f = task_lookup_next_fdget_rcu(curr_task, &curr_fd);
+	f = fget_task_next(curr_task, &curr_fd);
 	if (f) {
 		/* set info->fd */
 		info->fd = curr_fd;
-		rcu_read_unlock();
 		return f;
 	}
 
 	/* the current task is done, go to the next task */
-	rcu_read_unlock();
 	put_task_struct(curr_task);
 
 	if (info->common.type == BPF_TASK_ITER_TID) {
diff --git a/kernel/kcmp.c b/kernel/kcmp.c
index b0639f21041f..2c596851f8a9 100644
--- a/kernel/kcmp.c
+++ b/kernel/kcmp.c
@@ -63,9 +63,7 @@ get_file_raw_ptr(struct task_struct *task, unsigned int idx)
 {
 	struct file *file;
 
-	rcu_read_lock();
-	file = task_lookup_fdget_rcu(task, idx);
-	rcu_read_unlock();
+	file = fget_task(task, idx);
 	if (file)
 		fput(file);
 

From patchwork Mon Aug 12 06:44:18 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Al Viro
X-Patchwork-Id: 13760194
From: Al Viro
To: viro@zeniv.linux.org.uk
Cc: brauner@kernel.org, jack@suse.cz, linux-fsdevel@vger.kernel.org
Subject: [PATCH 02/11] remove pointless includes of <linux/fdtable.h>
Date: Mon, 12 Aug 2024 07:44:18 +0100
Message-ID: <20240812064427.240190-2-viro@zeniv.linux.org.uk>
X-Mailer: git-send-email 2.46.0
In-Reply-To: <20240812064427.240190-1-viro@zeniv.linux.org.uk>
References: <20240812064214.GH13701@ZenIV> <20240812064427.240190-1-viro@zeniv.linux.org.uk>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

some of those used to be needed, some had been cargo-culted for
no reason...

Signed-off-by: Al Viro
Reviewed-by: Christian Brauner
---
 fs/fcntl.c                         | 1 -
 fs/file_table.c                    | 1 -
 fs/notify/fanotify/fanotify.c      | 1 -
 fs/notify/fanotify/fanotify_user.c | 1 -
 fs/overlayfs/copy_up.c             | 1 -
 fs/proc/base.c                     | 1 -
 io_uring/io_uring.c                | 1 -
 kernel/bpf/bpf_inode_storage.c     | 1 -
 kernel/bpf/bpf_task_storage.c      | 1 -
 kernel/bpf/token.c                 | 1 -
 kernel/exit.c                      | 1 -
 kernel/module/dups.c               | 1 -
 kernel/module/kmod.c               | 1 -
 kernel/umh.c                       | 1 -
 net/handshake/request.c            | 1 -
 security/apparmor/domain.c         | 1 -
 16 files changed, 16 deletions(-)

diff --git a/fs/fcntl.c b/fs/fcntl.c
index 300e5d9ad913..56262bdbc544 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -12,7 +12,6 @@
 #include
 #include
 #include
-#include <linux/fdtable.h>
 #include
 #include
 #include
diff --git a/fs/file_table.c b/fs/file_table.c
index ca7843dde56d..15adc085df4f 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -9,7 +9,6 @@
 #include
 #include
 #include
-#include <linux/fdtable.h>
 #include
 #include
 #include
diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 224bccaab4cc..24c7c5df4998 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -1,6 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
 #include
-#include <linux/fdtable.h>
 #include
 #include
 #include
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 9ec313e9f6e1..cc91977cf202 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -1,7 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 #include
 #include
-#include <linux/fdtable.h>
 #include
 #include
 #include
diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index a5ef2005a2cc..73b502524d1c 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -16,7 +16,6 @@
 #include
 #include
 #include
-#include <linux/fdtable.h>
 #include
 #include
 #include "overlayfs.h"
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 72a1acd03675..d4ecad2b0a2e 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -58,7 +58,6 @@
 #include
 #include
 #include
-#include <linux/fdtable.h>
 #include
 #include
 #include
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 3942db160f18..0b4be6d5edaf 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -51,7 +51,6 @@
 #include
 #include
 #include
-#include <linux/fdtable.h>
 #include
 #include
 #include
diff --git a/kernel/bpf/bpf_inode_storage.c b/kernel/bpf/bpf_inode_storage.c
index b0ef45db207c..f8b97d8a874a 100644
--- a/kernel/bpf/bpf_inode_storage.c
+++ b/kernel/bpf/bpf_inode_storage.c
@@ -16,7 +16,6 @@
 #include
 #include
 #include
-#include <linux/fdtable.h>
 #include
 
 DEFINE_BPF_STORAGE_CACHE(inode_cache);
diff --git a/kernel/bpf/bpf_task_storage.c b/kernel/bpf/bpf_task_storage.c
index adf6dfe0ba68..1eb9852a9f8e 100644
--- a/kernel/bpf/bpf_task_storage.c
+++ b/kernel/bpf/bpf_task_storage.c
@@ -16,7 +16,6 @@
 #include
 #include
 #include
-#include <linux/fdtable.h>
 #include
 
 DEFINE_BPF_STORAGE_CACHE(task_cache);
diff --git a/kernel/bpf/token.c b/kernel/bpf/token.c
index d6ccf8d00eab..3ea6a7505662 100644
--- a/kernel/bpf/token.c
+++ b/kernel/bpf/token.c
@@ -1,6 +1,5 @@
 #include
 #include
-#include <linux/fdtable.h>
 #include
 #include
 #include
diff --git a/kernel/exit.c b/kernel/exit.c
index 7430852a8571..d441193a4537 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -25,7 +25,6 @@
 #include
 #include
 #include
-#include <linux/fdtable.h>
 #include
 #include
 #include
diff --git a/kernel/module/dups.c b/kernel/module/dups.c
index 9a92f2f8c9d3..bd2149fbe117 100644
--- a/kernel/module/dups.c
+++ b/kernel/module/dups.c
@@ -18,7 +18,6 @@
 #include
 #include
 #include
-#include <linux/fdtable.h>
 #include
 #include
 #include
diff --git a/kernel/module/kmod.c b/kernel/module/kmod.c
index 0800d9891692..25f253812512 100644
--- a/kernel/module/kmod.c
+++ b/kernel/module/kmod.c
@@ -15,7 +15,6 @@
 #include
 #include
 #include
-#include <linux/fdtable.h>
 #include
 #include
 #include
diff --git a/kernel/umh.c b/kernel/umh.c
index ff1f13a27d29..be9234270777 100644
--- a/kernel/umh.c
+++ b/kernel/umh.c
@@ -13,7 +13,6 @@
 #include
 #include
 #include
-#include <linux/fdtable.h>
 #include
 #include
 #include
diff --git a/net/handshake/request.c b/net/handshake/request.c
index 94d5cef3e048..274d2c89b6b2 100644
--- a/net/handshake/request.c
+++ b/net/handshake/request.c
@@ -13,7 +13,6 @@
 #include
 #include
 #include
-#include <linux/fdtable.h>
 #include
 
 #include
diff --git a/security/apparmor/domain.c b/security/apparmor/domain.c
index 571158ec6188..2bc34dce9a46 100644
--- a/security/apparmor/domain.c
+++ b/security/apparmor/domain.c
@@ -9,7 +9,6 @@
  */
 
 #include
-#include <linux/fdtable.h>
 #include
 #include
 #include

From patchwork Mon Aug 12 06:44:19 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Al Viro
X-Patchwork-Id: 13760193
From: Al Viro
To: viro@zeniv.linux.org.uk
Cc: brauner@kernel.org, jack@suse.cz, linux-fsdevel@vger.kernel.org
Subject: [PATCH 03/11] close_files(): don't bother with xchg()
Date: Mon, 12 Aug 2024 07:44:19 +0100
Message-ID: <20240812064427.240190-3-viro@zeniv.linux.org.uk>
X-Mailer: git-send-email 2.46.0
In-Reply-To: <20240812064427.240190-1-viro@zeniv.linux.org.uk>
References: <20240812064214.GH13701@ZenIV> <20240812064427.240190-1-viro@zeniv.linux.org.uk>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

At that point nobody else has references to the victim files_struct;
as a matter of fact, the caller will free it immediately after
close_files() returns, with no RCU delays or anything of that sort.

That's why we are not protecting against fdtable reallocation on
expansion, not cleaning the bitmaps, etc.  There's no point zeroing
the pointers in ->fd[] either, let alone making that an atomic operation.

Signed-off-by: Al Viro
---
 fs/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/file.c b/fs/file.c
index ac9e04e97e4b..313cfb860941 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -428,7 +428,7 @@ static struct fdtable *close_files(struct files_struct * files)
 		set = fdt->open_fds[j++];
 		while (set) {
 			if (set & 1) {
-				struct file * file = xchg(&fdt->fd[i], NULL);
+				struct file *file = fdt->fd[i];
 				if (file) {
 					filp_close(file, files);
 					cond_resched();

From patchwork Mon Aug 12 06:44:20 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Al Viro
X-Patchwork-Id: 13760200
From: Al Viro
To: viro@zeniv.linux.org.uk
Cc: brauner@kernel.org, jack@suse.cz, linux-fsdevel@vger.kernel.org
Subject: [PATCH 04/11] proc_fd_getattr(): don't bother with S_ISDIR() check
Date: Mon, 12 Aug 2024 07:44:20 +0100
Message-ID: <20240812064427.240190-4-viro@zeniv.linux.org.uk>
X-Mailer: git-send-email 2.46.0
In-Reply-To: <20240812064427.240190-1-viro@zeniv.linux.org.uk>
References: <20240812064214.GH13701@ZenIV> <20240812064427.240190-1-viro@zeniv.linux.org.uk>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

that thing is callable only as ->i_op->getattr() instance and only
for directory inodes (/proc/*/fd and /proc/*/task/*/fd)

Reviewed-by: Christian Brauner
Signed-off-by: Al Viro
---
 fs/proc/fd.c | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/fs/proc/fd.c b/fs/proc/fd.c
index 077c51ba1ba7..f3c5767db3d4 100644
--- a/fs/proc/fd.c
+++ b/fs/proc/fd.c
@@ -351,18 +351,9 @@ static int proc_fd_getattr(struct mnt_idmap *idmap,
 			u32 request_mask, unsigned int query_flags)
 {
 	struct inode *inode = d_inode(path->dentry);
-	int rv = 0;
 
 	generic_fillattr(&nop_mnt_idmap, request_mask, inode, stat);
-
-	/* If it's a directory, put the number of open fds there */
-	if (S_ISDIR(inode->i_mode)) {
-		rv = proc_readfd_count(inode, &stat->size);
-		if (rv < 0)
-			return rv;
-	}
-
-	return rv;
+	return proc_readfd_count(inode, &stat->size);
 }
 
 const struct inode_operations proc_fd_inode_operations = {

From patchwork Mon Aug 12 06:44:21 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Al Viro
X-Patchwork-Id: 13760196
From: Al Viro
To: viro@zeniv.linux.org.uk
Cc: brauner@kernel.org, jack@suse.cz, linux-fsdevel@vger.kernel.org
Subject: [PATCH 05/11] move close_range(2) into fs/file.c, fold __close_range() into it
Date: Mon, 12 Aug 2024 07:44:21 +0100
Message-ID: <20240812064427.240190-5-viro@zeniv.linux.org.uk>
X-Mailer: git-send-email 2.46.0
In-Reply-To: <20240812064427.240190-1-viro@zeniv.linux.org.uk>
References: <20240812064214.GH13701@ZenIV> <20240812064427.240190-1-viro@zeniv.linux.org.uk>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

We never had callers for __close_range() except for close_range(2)
itself.  Nothing of that sort has appeared in four years and if any users
do show up, we can always separate those suckers again.

Reviewed-by: Christian Brauner
Signed-off-by: Al Viro
---
 fs/file.c               |  6 ++++--
 fs/open.c               | 17 -----------------
 include/linux/fdtable.h |  1 -
 3 files changed, 4 insertions(+), 20 deletions(-)

diff --git a/fs/file.c b/fs/file.c
index 313cfb860941..fbcd3da46109 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -728,7 +728,7 @@ static inline void __range_close(struct files_struct *files, unsigned int fd,
 }
 
 /**
- * __close_range() - Close all file descriptors in a given range.
+ * sys_close_range() - Close all file descriptors in a given range.
  *
  * @fd: starting file descriptor to close
  * @max_fd: last file descriptor to close
@@ -736,8 +736,10 @@ static inline void __range_close(struct files_struct *files, unsigned int fd,
  *
  * This closes a range of file descriptors. All file descriptors
  * from @fd up to and including @max_fd are closed.
+ * Currently, errors to close a given file descriptor are ignored.
  */
-int __close_range(unsigned fd, unsigned max_fd, unsigned int flags)
+SYSCALL_DEFINE3(close_range, unsigned int, fd, unsigned int, max_fd,
+		unsigned int, flags)
 {
 	struct task_struct *me = current;
 	struct files_struct *cur_fds = me->files, *fds = NULL;
diff --git a/fs/open.c b/fs/open.c
index 22adbef7ecc2..25443d846d5e 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -1575,23 +1575,6 @@ SYSCALL_DEFINE1(close, unsigned int, fd)
 	return retval;
 }
 
-/**
- * sys_close_range() - Close all file descriptors in a given range.
- *
- * @fd: starting file descriptor to close
- * @max_fd: last file descriptor to close
- * @flags: reserved for future extensions
- *
- * This closes a range of file descriptors. All file descriptors
- * from @fd up to and including @max_fd are closed.
- * Currently, errors to close a given file descriptor are ignored.
- */
-SYSCALL_DEFINE3(close_range, unsigned int, fd, unsigned int, max_fd,
-		unsigned int, flags)
-{
-	return __close_range(fd, max_fd, flags);
-}
-
 /*
  * This routine simulates a hangup on the tty, to arrange that users
  * are given clean terminals at login time.
diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h
index b395a34eebf4..42cadad89f99 100644
--- a/include/linux/fdtable.h
+++ b/include/linux/fdtable.h
@@ -109,7 +109,6 @@ int iterate_fd(struct files_struct *, unsigned,
 		const void *);
 
 extern int close_fd(unsigned int fd);
-extern int __close_range(unsigned int fd, unsigned int max_fd, unsigned int flags);
 extern struct file *file_close_fd(unsigned int fd);
 extern int unshare_fd(unsigned long unshare_flags, unsigned int max_fds,
 		      struct files_struct **new_fdp);

From patchwork Mon Aug 12 06:44:22 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Al Viro
X-Patchwork-Id: 13760195
header.b=aHWmBW19; arc=none smtp.client-ip=62.89.141.173 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=zeniv.linux.org.uk Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ftp.linux.org.uk Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=linux.org.uk header.i=@linux.org.uk header.b="aHWmBW19" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=linux.org.uk; s=zeniv-20220401; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=tWLyQ/FDhiruX0I1ocFEjjB2P486jShUsYpSsgby//E=; b=aHWmBW19SSLfjN+DSDMnlNRiry sNzHx8AOW03LHLuVckYncHYBdYmdF14cN5FW9muo2MgeF1IKX/qFc2I4xaYMDiBAeInY5nyXmiWAo mgih6RNtWC5ND2ya1/l8ovNkCJYFSKd85V4YmExEyfWKfgt4OiQrSlD3YbPGHJ4xfruqiIwC9300A p95pwrVXqzXp3EBcnpHJswJuUGF/k/hmzmFy4NzIcPQUEuq6FFnsVmz2yE8otEMxp3Tlbnl+l7kvF BFgKgHtUGGIZF4unCbAuEFub/v8WEPYtB4BwU/Ys2cLjSnm13bGW8ncgEf8BAireU70bFbpb+IxRY DKhjVttg==; Received: from viro by zeniv.linux.org.uk with local (Exim 4.98 #2 (Red Hat Linux)) id 1sdOn6-000000010Ud-10Mw; Mon, 12 Aug 2024 06:44:28 +0000 From: Al Viro To: viro@zeniv.linux.org.uk Cc: brauner@kernel.org, jack@suse.cz, linux-fsdevel@vger.kernel.org Subject: [PATCH 06/11] sane_fdtable_size(): don't bother looking at descriptors we are not going to copy Date: Mon, 12 Aug 2024 07:44:22 +0100 Message-ID: <20240812064427.240190-6-viro@zeniv.linux.org.uk> X-Mailer: git-send-email 2.46.0 In-Reply-To: <20240812064427.240190-1-viro@zeniv.linux.org.uk> References: <20240812064214.GH13701@ZenIV> <20240812064427.240190-1-viro@zeniv.linux.org.uk> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Sender: Al Viro When given a max_fds argument lower than the current size (which can happen when called from close_range(..., CLOSE_RANGE_UNSHARE)), we can ignore all
descriptors >= max_fds. Signed-off-by: Al Viro Reviewed-by: Christian Brauner --- fs/file.c | 43 +++++++++++++++++++++++-------------------- 1 file changed, 23 insertions(+), 20 deletions(-) diff --git a/fs/file.c b/fs/file.c index fbcd3da46109..894bd18241b5 100644 --- a/fs/file.c +++ b/fs/file.c @@ -272,20 +272,6 @@ static inline bool fd_is_open(unsigned int fd, const struct fdtable *fdt) return test_bit(fd, fdt->open_fds); } -static unsigned int count_open_files(struct fdtable *fdt) -{ - unsigned int size = fdt->max_fds; - unsigned int i; - - /* Find the last open fd */ - for (i = size / BITS_PER_LONG; i > 0; ) { - if (fdt->open_fds[--i]) - break; - } - i = (i + 1) * BITS_PER_LONG; - return i; -} - /* * Note that a sane fdtable size always has to be a multiple of * BITS_PER_LONG, since we have bitmaps that are sized by this. @@ -297,16 +283,33 @@ static unsigned int count_open_files(struct fdtable *fdt) * * Rather than make close_range() have to worry about this, * just make that BITS_PER_LONG alignment be part of a sane - * fdtable size. Becuase that's really what it is. + * fdtable size. Because that's really what it is. */ static unsigned int sane_fdtable_size(struct fdtable *fdt, unsigned int max_fds) { - unsigned int count; + const unsigned int min_words = BITS_TO_LONGS(NR_OPEN_DEFAULT); // 1 + unsigned long mask; + unsigned int words; + + if (max_fds > fdt->max_fds) + max_fds = fdt->max_fds; + + if (max_fds == NR_OPEN_DEFAULT) + return NR_OPEN_DEFAULT; + + /* + * What follows is a simplified find_last_bit(). There's no point + * finding exact last bit, when we are going to round it up anyway. 
+ */ + words = BITS_TO_LONGS(max_fds); + mask = BITMAP_LAST_WORD_MASK(max_fds); + + while (words > min_words && !(fdt->open_fds[words - 1] & mask)) { + mask = ~0UL; + words--; + } - count = count_open_files(fdt); - if (max_fds < NR_OPEN_DEFAULT) - max_fds = NR_OPEN_DEFAULT; - return ALIGN(min(count, max_fds), BITS_PER_LONG); + return words * BITS_PER_LONG; } /* From patchwork Mon Aug 12 06:44:23 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Al Viro X-Patchwork-Id: 13760197 Received: from zeniv.linux.org.uk (zeniv.linux.org.uk [62.89.141.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EA37B1509A8 for ; Mon, 12 Aug 2024 06:44:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=62.89.141.173 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723445071; cv=none; b=JZ1kHMMWF2C/0rlsUG4NhsmC6dF9E9jAPRuaIFoSGYFx8EpEx/z249KLa4ZqdTzEI3kY9uLIKXPROI7EbYm+GqIq8qxN2+fUybaxflxA96bkMoxpllrvkZltS3WBfotdcpwcp741hlO9nBBzvIZHKV0njaKYfJg80dZSbmezj7w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723445071; c=relaxed/simple; bh=+qcGKHob8p3oDUUG+zfvCzxe5AO0Jfpmj0UcT7GDRAQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=qbMG+Xn75t3c+38wfXyPeDsetMbkzz4tpFYZtbuo5Cn269npfUyPuQJYm2JAS52XbN+/486GykyNFWk/Z9xVj1Ah7QqcFo+Af9bg6LtUhXc73Rdh2XclOZKZwSHnrxmZE0Q9NgjhJFl8Xok/8q0ocGBdQlAh8WoyfSc40uNbm7I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=zeniv.linux.org.uk; spf=none smtp.mailfrom=ftp.linux.org.uk; dkim=pass (2048-bit key) header.d=linux.org.uk header.i=@linux.org.uk header.b=R/7b+6U3; arc=none smtp.client-ip=62.89.141.173 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) 
header.from=zeniv.linux.org.uk Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ftp.linux.org.uk Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=linux.org.uk header.i=@linux.org.uk header.b="R/7b+6U3" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=linux.org.uk; s=zeniv-20220401; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=UMaZZ1YtEb3YlU3PlNTlA4FF5/bJC7A8/2t79j6VGjg=; b=R/7b+6U35qwCbGnSwl7Cx/zn4d jjv9Wd+ffW+FZ/wiPrxyREONJKAFWfc2ay3O1RS6F8bNSk4vLfDg+8RxSnCmBy5by2RsZJy413bBe OlFztr4scZMIjJ24qkp6d8DDDwspsERkeCeX69mD5/Tw8azj6qRlon0+JWEk6MpQmAYBxRj6xxTRB jbUO3ybyLrISNaIDQU4zb9e4/v0gN/FnCwIafQsoPh6QNBROGB9+adn2C6n9uOJDhHWrPrTOvrmNm Op26F77WmoyU9QXI0YEJjhoiPGYcF9IwGqtDDFZCmDNpQytgJ7egcZ6fpxyMO/qbVgB67d/WGwKa2 d6ZJK94g==; Received: from viro by zeniv.linux.org.uk with local (Exim 4.98 #2 (Red Hat Linux)) id 1sdOn6-000000010Uk-1afS; Mon, 12 Aug 2024 06:44:28 +0000 From: Al Viro To: viro@zeniv.linux.org.uk Cc: brauner@kernel.org, jack@suse.cz, linux-fsdevel@vger.kernel.org Subject: [PATCH 07/11] fs/file.c: remove sanity_check and add likely/unlikely in alloc_fd() Date: Mon, 12 Aug 2024 07:44:23 +0100 Message-ID: <20240812064427.240190-7-viro@zeniv.linux.org.uk> X-Mailer: git-send-email 2.46.0 In-Reply-To: <20240812064427.240190-1-viro@zeniv.linux.org.uk> References: <20240812064214.GH13701@ZenIV> <20240812064427.240190-1-viro@zeniv.linux.org.uk> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Sender: Al Viro From: Yu Ma alloc_fd() has a sanity check inside to make sure the struct file mapping to the allocated fd is NULL. Remove this sanity check, since that is already guaranteed by the existing zero initialization and by the slot being set to NULL when an fd is recycled.
Meanwhile, add likely/unlikely and expand_file() call avoidance to reduce the work under file_lock. Reviewed-by: Jan Kara Reviewed-by: Tim Chen Signed-off-by: Yu Ma Link: https://lore.kernel.org/r/20240717145018.3972922-2-yu.ma@intel.com Signed-off-by: Christian Brauner Signed-off-by: Al Viro --- fs/file.c | 33 ++++++++++++++------------------- 1 file changed, 14 insertions(+), 19 deletions(-) diff --git a/fs/file.c b/fs/file.c index 894bd18241b5..e217247006a2 100644 --- a/fs/file.c +++ b/fs/file.c @@ -514,7 +514,7 @@ static int alloc_fd(unsigned start, unsigned end, unsigned flags) if (fd < files->next_fd) fd = files->next_fd; - if (fd < fdt->max_fds) + if (likely(fd < fdt->max_fds)) fd = find_next_fd(fdt, fd); /* @@ -522,19 +522,21 @@ static int alloc_fd(unsigned start, unsigned end, unsigned flags) * will limit the total number of files that can be opened. */ error = -EMFILE; - if (fd >= end) + if (unlikely(fd >= end)) goto out; - error = expand_files(files, fd); - if (error < 0) - goto out; + if (unlikely(fd >= fdt->max_fds)) { + error = expand_files(files, fd); + if (error < 0) + goto out; - /* - * If we needed to expand the fs array we - * might have blocked - try again. - */ - if (error) - goto repeat; + /* + * If we needed to expand the fs array we + * might have blocked - try again. 
+ */ + if (error) + goto repeat; + } if (start <= files->next_fd) files->next_fd = fd + 1; @@ -545,13 +547,6 @@ static int alloc_fd(unsigned start, unsigned end, unsigned flags) else __clear_close_on_exec(fd, fdt); error = fd; -#if 1 - /* Sanity check */ - if (rcu_access_pointer(fdt->fd[fd]) != NULL) { - printk(KERN_WARNING "alloc_fd: slot %d not NULL!\n", fd); - rcu_assign_pointer(fdt->fd[fd], NULL); - } -#endif out: spin_unlock(&files->file_lock); @@ -617,7 +612,7 @@ void fd_install(unsigned int fd, struct file *file) rcu_read_unlock_sched(); spin_lock(&files->file_lock); fdt = files_fdtable(files); - BUG_ON(fdt->fd[fd] != NULL); + WARN_ON(fdt->fd[fd] != NULL); rcu_assign_pointer(fdt->fd[fd], file); spin_unlock(&files->file_lock); return; From patchwork Mon Aug 12 06:44:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Al Viro X-Patchwork-Id: 13760202 Received: from zeniv.linux.org.uk (zeniv.linux.org.uk [62.89.141.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 10A8E1509B0 for ; Mon, 12 Aug 2024 06:44:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=62.89.141.173 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723445072; cv=none; b=Uz1Zsc6P/Hqd3te5605avJyLAR3/DZg6187L8n1mHDiC+m18uofhuqmSI0Io2piFSltx2zbotHuGzhhLTROwB9TxrnJBfxJZlFFf9uOkOsREldfv5P52bKBcUz5dHgTOJMJrVzgUBRLzoTsAxNWIwBu7mGZdp9nPYSO6fRVkgPc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723445072; c=relaxed/simple; bh=OoMkfFG7NYPDQeGPh9F2RtPQfrI1qGuFCoG97+Vpxlk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=jSkaVlns0yQuMaXyyflVUZ1BPksrlOUBd0decQs3bV4muEA6+nGoAJifuDGTcXAwZU/BRtSzIr8SbHh8NyUK6eP4odugeeCYyg+jOdCmumUMwhWLuzCKZJ9wYyzuzQBADqy0tAim8OOHIYXbFTbbldqk4C8FWfQy+tunTXmgQIk= 
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=zeniv.linux.org.uk; spf=none smtp.mailfrom=ftp.linux.org.uk; dkim=pass (2048-bit key) header.d=linux.org.uk header.i=@linux.org.uk header.b=X2LdiexZ; arc=none smtp.client-ip=62.89.141.173 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=zeniv.linux.org.uk Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ftp.linux.org.uk Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=linux.org.uk header.i=@linux.org.uk header.b="X2LdiexZ" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=linux.org.uk; s=zeniv-20220401; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=ea9+Pbrzm8vZKUAnqdPEgFi3DlFNn7sRdg4QsDGH8vo=; b=X2LdiexZGykV5tsG9z/QJ7jpYt 71m9sDxd8ft05mT8jaNo3ZATYjHCkGsvWF33nq/gTdtnIlEfhmaxoni+stegIc5jcdYDb8WaDVduK uJCM3IlgYb1Z/4sZwABWwnAFqaEzqgriVsSEkJQFlfhstDTPDvfgF5dzTranWJwV/gHfwBWzFsi0J qA2OAuo9UufnnwSyWI3J/L7ql3aI1ZkS8dcFMc94cCgGs/eNvs0q/q7unRn+XZMxOhx0I4SFwfALm smpOjL/C4rq5o2CCZqeTENTlVcUoqyjEPLdcSC+fbo7h5G4RcAmTf8sJ9tMbReVpMxxhj/fLBSHyk IJDCMRpg==; Received: from viro by zeniv.linux.org.uk with local (Exim 4.98 #2 (Red Hat Linux)) id 1sdOn6-000000010Uo-1zvF; Mon, 12 Aug 2024 06:44:28 +0000 From: Al Viro To: viro@zeniv.linux.org.uk Cc: brauner@kernel.org, jack@suse.cz, linux-fsdevel@vger.kernel.org Subject: [PATCH 08/11] fs/file.c: conditionally clear full_fds Date: Mon, 12 Aug 2024 07:44:24 +0100 Message-ID: <20240812064427.240190-8-viro@zeniv.linux.org.uk> X-Mailer: git-send-email 2.46.0 In-Reply-To: <20240812064427.240190-1-viro@zeniv.linux.org.uk> References: <20240812064214.GH13701@ZenIV> <20240812064427.240190-1-viro@zeniv.linux.org.uk> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: 
List-Unsubscribe: MIME-Version: 1.0 Sender: Al Viro From: Yu Ma Each group of 64 bits in open_fds maps to a single bit in full_fds_bits. It is very likely that the corresponding bit in full_fds_bits is already clear by the time __clear_open_fd() runs. Test that bit before clearing it, to avoid an unnecessary write and the cache bouncing it causes. See commit fc90888d07b8 ("vfs: conditionally clear close-on-exec flag") for a similar optimization. With the stock kernel plus patch 1 as the baseline, this improves pts/blogbench-1.1.0 read by 13% and write by 5% on an Intel ICX 160-core configuration with v6.10-rc7. Reviewed-by: Jan Kara Reviewed-by: Tim Chen Signed-off-by: Yu Ma Link: https://lore.kernel.org/r/20240717145018.3972922-3-yu.ma@intel.com Signed-off-by: Christian Brauner Signed-off-by: Al Viro --- fs/file.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/fs/file.c b/fs/file.c index e217247006a2..0340c811b22a 100644 --- a/fs/file.c +++ b/fs/file.c @@ -264,7 +264,9 @@ static inline void __set_open_fd(unsigned int fd, struct fdtable *fdt) static inline void __clear_open_fd(unsigned int fd, struct fdtable *fdt) { __clear_bit(fd, fdt->open_fds); - __clear_bit(fd / BITS_PER_LONG, fdt->full_fds_bits); + fd /= BITS_PER_LONG; + if (test_bit(fd, fdt->full_fds_bits)) + __clear_bit(fd, fdt->full_fds_bits); } static inline bool fd_is_open(unsigned int fd, const struct fdtable *fdt) From patchwork Mon Aug 12 06:44:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Al Viro X-Patchwork-Id: 13760199 Received: from zeniv.linux.org.uk (zeniv.linux.org.uk [62.89.141.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2E4BA152176 for ; Mon, 12 Aug 2024 06:44:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=62.89.141.173 ARC-Seal: i=1; a=rsa-sha256;
d=subspace.kernel.org; s=arc-20240116; t=1723445071; cv=none; b=TNfVL3M4eD+6Z3OMLpjGQLApCkBMkKmYMuzRxpQ+Air4Ouuhs4/AeYBCYxXICWoOD4CiSzHxyK5PIL5JnJ/jqfS7VNSndpfFksW/FEwMZxGo/fdqn6KJB8pC8Y6N0GDDbaIDfLEsQYkpAQAJ5Y+xpo876g8IgQ+otTEBchiS20o= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723445071; c=relaxed/simple; bh=yKsBmsESsGcdPNhrpQn9NF43ra4r96XbhDcVtqpIWjI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=F+qQZVTM1wPsevC+u7FuTrwFJuOXDgC65ewfAIFs3icKYBAmnUsmJGH0PoLGYWpQVqjqpE0rD+ZcOg92h+907AcXRx851ewTcxfjbTYCOywKURU80VzLPcCOR4X0fT0p1boY2Jpwea3jotglYAHbjsUkW1lQHbJIHLqkPiKaBp0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=zeniv.linux.org.uk; spf=none smtp.mailfrom=ftp.linux.org.uk; dkim=pass (2048-bit key) header.d=linux.org.uk header.i=@linux.org.uk header.b=cOj5BjfG; arc=none smtp.client-ip=62.89.141.173 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=zeniv.linux.org.uk Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ftp.linux.org.uk Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=linux.org.uk header.i=@linux.org.uk header.b="cOj5BjfG" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=linux.org.uk; s=zeniv-20220401; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=a/B+4NoClzkcJyCwlXuszeYtFek+zdw7ynoQgbc1yRo=; b=cOj5BjfG5zf1vXTaOn0iQUn/8T Nem2ZYIzEMH1mhNINbZ+HwSZO7ZuaFrSzDeK2vli/hKHPxtY6slAEn8sZoDV6tV+2XJgHXNjI0Xt4 pYxtHNV8fNO+xgYwmLmHFfpFsM6uMA2AZHITZQu8kJDrdsKtbVSbd/ObB0bM210xl5ahhP8Wip2x2 LroaFCzyPy+ds9ic0MckI7WuZza2PDt+7Td5jqDlBoyRMy023yg80lBnTLzWW7QRr5Gmn4K8NEMWj KQQ5v1uOH0Uk8qwbWP4192RZUMk3DVzq//eXVh1etghDL7o8VxHbDg32kyCdgr1qrIYqw/TW8VmRT z3vzzTRQ==; Received: from viro by zeniv.linux.org.uk with local (Exim 
4.98 #2 (Red Hat Linux)) id 1sdOn6-000000010Uy-2qNr; Mon, 12 Aug 2024 06:44:28 +0000 From: Al Viro To: viro@zeniv.linux.org.uk Cc: brauner@kernel.org, jack@suse.cz, linux-fsdevel@vger.kernel.org Subject: [PATCH 09/11] fs/file.c: add fast path in find_next_fd() Date: Mon, 12 Aug 2024 07:44:25 +0100 Message-ID: <20240812064427.240190-9-viro@zeniv.linux.org.uk> X-Mailer: git-send-email 2.46.0 In-Reply-To: <20240812064427.240190-1-viro@zeniv.linux.org.uk> References: <20240812064214.GH13701@ZenIV> <20240812064427.240190-1-viro@zeniv.linux.org.uk> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Sender: Al Viro From: Yu Ma Skip the two-level search via find_next_zero_bit() when there is a free slot in the word that contains next_fd, because: (1) next_fd indicates the lower bound for the first free fd. (2) There is a fast path inside find_next_zero_bit() when size <= 64 that speeds up searching. (3) After fdt is expanded (the bitmap size doubles on each expansion), it is never shrunk. The search size increases but there are few open fds available here. This fast path was proposed by Mateusz Guzik and agreed to by Jan Kara; it is more generic and scalable than previous versions. On top of patches 1 and 2, it improves pts/blogbench-1.1.0 read by 8% and write by 4% on an Intel ICX 160-core configuration with v6.10-rc7.
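For readers who want to experiment with the idea outside the kernel, the fast path described above can be approximated in user space. This is only a sketch under stated assumptions: struct toy_fdtable, toy_find_next_zero_bit() and toy_find_next_fd() are hypothetical names invented here, the bit search is a naive loop rather than the kernel's optimized find_next_zero_bit(), and the slow path after the fast path reflects my reading of the existing two-level search rather than anything shown in this patch.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Hypothetical stand-in for the kernel's struct fdtable; not the real layout. */
struct toy_fdtable {
	unsigned int max_fds;                /* always a multiple of BITS_PER_LONG */
	const unsigned long *open_fds;       /* one bit per descriptor */
	const unsigned long *full_fds_bits;  /* one bit per completely full open_fds word */
};

/* Naive search: first zero bit at index >= start, or size if there is none. */
static unsigned int toy_find_next_zero_bit(const unsigned long *map,
					   unsigned int size, unsigned int start)
{
	for (unsigned int i = start; i < size; i++)
		if (!(map[i / BITS_PER_LONG] & (1UL << (i % BITS_PER_LONG))))
			return i;
	return size;
}

static unsigned int toy_find_next_fd(const struct toy_fdtable *fdt,
				     unsigned int start)
{
	unsigned int maxfd = fdt->max_fds;
	unsigned int maxbit = maxfd / BITS_PER_LONG;
	unsigned int bitbit = start / BITS_PER_LONG;
	unsigned int bit;

	assert(start < maxfd);

	/* Fast path: look only at the single word that contains start. */
	bit = toy_find_next_zero_bit(&fdt->open_fds[bitbit], BITS_PER_LONG,
				     start % BITS_PER_LONG);
	if (bit < BITS_PER_LONG)
		return bit + bitbit * BITS_PER_LONG;

	/* Slow path (assumed): use the second-level bitmap to skip full words. */
	bitbit = toy_find_next_zero_bit(fdt->full_fds_bits, maxbit, bitbit) *
		 BITS_PER_LONG;
	if (bitbit >= maxfd)
		return maxfd;
	if (bitbit > start)
		start = bitbit;
	return toy_find_next_zero_bit(fdt->open_fds, maxfd, start);
}
```

When the word holding the starting descriptor still has a free bit at or above it, the second-level bitmap is never read; that is the common case the commit message argues for.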
Reviewed-by: Jan Kara Reviewed-by: Tim Chen Signed-off-by: Yu Ma Link: https://lore.kernel.org/r/20240717145018.3972922-4-yu.ma@intel.com Signed-off-by: Christian Brauner Signed-off-by: Al Viro --- fs/file.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/fs/file.c b/fs/file.c index 0340c811b22a..1ee85e061ade 100644 --- a/fs/file.c +++ b/fs/file.c @@ -490,6 +490,15 @@ static unsigned int find_next_fd(struct fdtable *fdt, unsigned int start) unsigned int maxfd = fdt->max_fds; /* always multiple of BITS_PER_LONG */ unsigned int maxbit = maxfd / BITS_PER_LONG; unsigned int bitbit = start / BITS_PER_LONG; + unsigned int bit; + + /* + * Try to avoid looking at the second level bitmap + */ + bit = find_next_zero_bit(&fdt->open_fds[bitbit], BITS_PER_LONG, + start & (BITS_PER_LONG - 1)); + if (bit < BITS_PER_LONG) + return bit + bitbit * BITS_PER_LONG; bitbit = find_next_zero_bit(fdt->full_fds_bits, maxbit, bitbit) * BITS_PER_LONG; if (bitbit >= maxfd) From patchwork Mon Aug 12 06:44:26 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Al Viro X-Patchwork-Id: 13760203 Received: from zeniv.linux.org.uk (zeniv.linux.org.uk [62.89.141.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 52D8A15218A for ; Mon, 12 Aug 2024 06:44:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=62.89.141.173 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723445073; cv=none; b=eYQBf5d9Z1vrm82Bn5K+TvWD86t8Tv1bEYa7LFBGy763myUEy9Ea9/kRncexgptLm6FRhv9IzyCJoawWZOD0tqPhJpgAizdW7yBnPxuYRbq2/elZCLoCq5sxBmhqH2QVUoNboOjUsDJDxjxjWz+cA04WLQfntzTkbll03UBaluo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723445073; c=relaxed/simple; bh=F2Se9N03hfe05pzZiZXkcIav7w4FXbX5uM2r7BRXFac=; 
h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=cTWPo/o5Ojnwq/Iwq+r2Zl1bjKxuXm8iMsg2sic+Jpeq+RbnMERZIIeQZNoFgxqTyoUte6/qQqJNZDH6sOyKFj4cDt2cFLcHbPNdf5ujwTtTG9ywxgA09yin/qACgbYmoPj5tfb0VjUG46Jgcw9lBlHRBQJLg3i0SBRJbbz3+N8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=zeniv.linux.org.uk; spf=none smtp.mailfrom=ftp.linux.org.uk; dkim=pass (2048-bit key) header.d=linux.org.uk header.i=@linux.org.uk header.b=gMpLmwOV; arc=none smtp.client-ip=62.89.141.173 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=zeniv.linux.org.uk Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ftp.linux.org.uk Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=linux.org.uk header.i=@linux.org.uk header.b="gMpLmwOV" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=linux.org.uk; s=zeniv-20220401; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=TW7pzd/pIyAH5wSkVfNt/lP9fBHMkFTso88usI0Achc=; b=gMpLmwOVcdPX0SmUnX2chvgCDJ zRI+DcNSCTmagFFncnm9lAlaLQA7cAsnGDWIqgEVzfGuKPBOA/3HhiP2sfqsXsEgzVntFBON3ou6z 7q7RNhK5+zAgJWAhxxR2sNgiYzpEEy/rrJZ5P9fRdZ2YOBUfZrSClcDpjW8BYEY4VzDmcl97hihRz PFwWS7VP4vtQWQOLyx0oC5605j/FJ3Y1Iqho2em0BWW6kbYRFiHu93Mw9rle3T5FqP2VkUcovPCGw RZlznJDOp8oGaZA5hHXiMhE2aL6aY5LaQj/z2GJig4JSvPpuqCEdF8zvI4aQ2bPvvvU2Jf5J7otiF ZSrG4cHg==; Received: from viro by zeniv.linux.org.uk with local (Exim 4.98 #2 (Red Hat Linux)) id 1sdOn6-000000010V5-3BMF; Mon, 12 Aug 2024 06:44:28 +0000 From: Al Viro To: viro@zeniv.linux.org.uk Cc: brauner@kernel.org, jack@suse.cz, linux-fsdevel@vger.kernel.org Subject: [PATCH 10/11] alloc_fdtable(): change calling conventions. 
Date: Mon, 12 Aug 2024 07:44:26 +0100 Message-ID: <20240812064427.240190-10-viro@zeniv.linux.org.uk> X-Mailer: git-send-email 2.46.0 In-Reply-To: <20240812064427.240190-1-viro@zeniv.linux.org.uk> References: <20240812064214.GH13701@ZenIV> <20240812064427.240190-1-viro@zeniv.linux.org.uk> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Sender: Al Viro First of all, tell it how many slots we want, not which slot is wanted. It makes one caller (dup_fd()) more straightforward and doesn't harm another (expand_fdtable()). Furthermore, make it return ERR_PTR() on failure rather than returning NULL. Simplifies the callers. Simplify the size calculation, while we are at it - note that we always have slots_wanted greater than BITS_PER_LONG. What the rules boil down to is * use the smallest power of two large enough to give us that many slots * on 32bit skip 64 and 128 - the minimal capacity we want there is 256 slots (i.e. 1Kb fd array). * on 64bit don't skip anything, the minimal capacity is 128 - and we'll never be asked for 64 or less. 128 slots means 1Kb fd array, again. * on 128bit, if that ever happens, don't skip anything - we'll never be asked for 128 or less, so the fd array allocation will be at least 2Kb. Signed-off-by: Al Viro Reviewed-by: Christian Brauner --- fs/file.c | 58 ++++++++++++++++++------------------------------------- 1 file changed, 19 insertions(+), 39 deletions(-) diff --git a/fs/file.c b/fs/file.c index 1ee85e061ade..01cef75ef132 100644 --- a/fs/file.c +++ b/fs/file.c @@ -89,18 +89,11 @@ static void copy_fdtable(struct fdtable *nfdt, struct fdtable *ofdt) * 'unsigned long' in some places, but simply because that is how the Linux * kernel bitmaps are defined to work: they are not "bits in an array of bytes", * they are very much "bits in an array of unsigned long". 
- * - * The ALIGN(nr, BITS_PER_LONG) here is for clarity: since we just multiplied - * by that "1024/sizeof(ptr)" before, we already know there are sufficient - * clear low bits. Clang seems to realize that, gcc ends up being confused. - * - * On a 128-bit machine, the ALIGN() would actually matter. In the meantime, - * let's consider it documentation (and maybe a test-case for gcc to improve - * its code generation ;) */ -static struct fdtable * alloc_fdtable(unsigned int nr) +static struct fdtable *alloc_fdtable(unsigned int slots_wanted) { struct fdtable *fdt; + unsigned int nr; void *data; /* @@ -110,20 +103,22 @@ static struct fdtable * alloc_fdtable(unsigned int nr) * the fdarray into comfortable page-tuned chunks: starting at 1024B * and growing in powers of two from there on. */ - nr /= (1024 / sizeof(struct file *)); - nr = roundup_pow_of_two(nr + 1); - nr *= (1024 / sizeof(struct file *)); - nr = ALIGN(nr, BITS_PER_LONG); + if (IS_ENABLED(CONFIG_32BIT) && slots_wanted < 256) + nr = 256; + else + nr = roundup_pow_of_two(slots_wanted); /* * Note that this can drive nr *below* what we had passed if sysctl_nr_open - * had been set lower between the check in expand_files() and here. Deal - * with that in caller, it's cheaper that way. + * had been set lower between the check in expand_files() and here. * * We make sure that nr remains a multiple of BITS_PER_LONG - otherwise * bitmaps handling below becomes unpleasant, to put it mildly... 
*/ - if (unlikely(nr > sysctl_nr_open)) - nr = ((sysctl_nr_open - 1) | (BITS_PER_LONG - 1)) + 1; + if (unlikely(nr > sysctl_nr_open)) { + nr = round_down(sysctl_nr_open, BITS_PER_LONG); + if (nr < slots_wanted) + return ERR_PTR(-EMFILE); + } fdt = kmalloc(sizeof(struct fdtable), GFP_KERNEL_ACCOUNT); if (!fdt) @@ -152,7 +147,7 @@ static struct fdtable * alloc_fdtable(unsigned int nr) out_fdt: kfree(fdt); out: - return NULL; + return ERR_PTR(-ENOMEM); } /* @@ -169,7 +164,7 @@ static int expand_fdtable(struct files_struct *files, unsigned int nr) struct fdtable *new_fdt, *cur_fdt; spin_unlock(&files->file_lock); - new_fdt = alloc_fdtable(nr); + new_fdt = alloc_fdtable(nr + 1); /* make sure all fd_install() have seen resize_in_progress * or have finished their rcu_read_lock_sched() section. @@ -178,16 +173,8 @@ static int expand_fdtable(struct files_struct *files, unsigned int nr) synchronize_rcu(); spin_lock(&files->file_lock); - if (!new_fdt) - return -ENOMEM; - /* - * extremely unlikely race - sysctl_nr_open decreased between the check in - * caller and alloc_fdtable(). Cheaper to catch it here... 
- */ - if (unlikely(new_fdt->max_fds <= nr)) { - __free_fdtable(new_fdt); - return -EMFILE; - } + if (IS_ERR(new_fdt)) + return PTR_ERR(new_fdt); cur_fdt = files_fdtable(files); BUG_ON(nr < cur_fdt->max_fds); copy_fdtable(new_fdt, cur_fdt); @@ -357,16 +344,9 @@ struct files_struct *dup_fd(struct files_struct *oldf, unsigned int max_fds, int if (new_fdt != &newf->fdtab) __free_fdtable(new_fdt); - new_fdt = alloc_fdtable(open_files - 1); - if (!new_fdt) { - *errorp = -ENOMEM; - goto out_release; - } - - /* beyond sysctl_nr_open; nothing to do */ - if (unlikely(new_fdt->max_fds < open_files)) { - __free_fdtable(new_fdt); - *errorp = -EMFILE; + new_fdt = alloc_fdtable(open_files); + if (IS_ERR(new_fdt)) { + *errorp = PTR_ERR(new_fdt); goto out_release; } From patchwork Mon Aug 12 06:44:27 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Al Viro X-Patchwork-Id: 13760201 Received: from zeniv.linux.org.uk (zeniv.linux.org.uk [62.89.141.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 65C94152DF5 for ; Mon, 12 Aug 2024 06:44:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=62.89.141.173 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723445072; cv=none; b=KRA3w0e4tRcmt5bxJCqY5oukvEX80dUi/4pFVQTwDyZL2cNffYkjyz0UswOAdQM08PHI0Jz1t4++tG2PhgqnbsN7gMlQWf3jMjIBZWn8RtL2FfCLvR8ACioICeii9qmJ8gz7jGcngLphKMjOt3F2lo+b0v0yu0B0qih2fs70P/o= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723445072; c=relaxed/simple; bh=+RpMTmmO7c3bUnDElBDipk4sspVP+DpZqrtJtd2P6BM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Vv3jAdn3xm3jUn0/lmAh6O+r19Ns9iBH4JCg8zT4DgBTlsBee/O6wA+8QOolxCJRUMMnHnJ69mgL1PSfBfXjH5ulkddTyW3laUenbDCqKuFJSjUkrzc9gs9SNgN/u+hwDnJ1KV4LFdNT5SHaYaDk2tPa78oLt3gyzghKtnLW4ik= 
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=zeniv.linux.org.uk; spf=none smtp.mailfrom=ftp.linux.org.uk; dkim=pass (2048-bit key) header.d=linux.org.uk header.i=@linux.org.uk header.b=WUq2OssD; arc=none smtp.client-ip=62.89.141.173 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=zeniv.linux.org.uk Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ftp.linux.org.uk Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=linux.org.uk header.i=@linux.org.uk header.b="WUq2OssD" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=linux.org.uk; s=zeniv-20220401; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=99b7UZHUl91kNq0APruQ+TjL5+upTiM8P8+AlXXDai0=; b=WUq2OssD2ARl035NDXquR6j00h 9mJRdBUjR9MCSzOYy2QWAyl/pZmAp5r1w/alkZuA2EO2yl/v1ShonFiqiPjecIRg5vII8z/qryPA7 qOPxEt750qDlA25EX39HoU1EqLv5wNPvjgXHmW2PZyaJkDinfTlx3zto5gLCxujOtdHTgExXvt8dM +v5HsuXC/ofT/ssZ1paVWeSVE57jfpbUf/SHmGF6IirBh71no7ewtcVd+0+CUX0Oe7mvglzAUcR4j JTt7knZzWAbAdJvl79k00YGlNyYyNkJRrHzq0cuL/9ECvaYO35Z+s3s70kBhL1G+q811CgctBj28d KFVAo+yw==; Received: from viro by zeniv.linux.org.uk with local (Exim 4.98 #2 (Red Hat Linux)) id 1sdOn6-000000010VC-3iX4; Mon, 12 Aug 2024 06:44:28 +0000 From: Al Viro To: viro@zeniv.linux.org.uk Cc: brauner@kernel.org, jack@suse.cz, linux-fsdevel@vger.kernel.org Subject: [PATCH 11/11] dup_fd(): change calling conventions Date: Mon, 12 Aug 2024 07:44:27 +0100 Message-ID: <20240812064427.240190-11-viro@zeniv.linux.org.uk> X-Mailer: git-send-email 2.46.0 In-Reply-To: <20240812064427.240190-1-viro@zeniv.linux.org.uk> References: <20240812064214.GH13701@ZenIV> <20240812064427.240190-1-viro@zeniv.linux.org.uk> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: 
MIME-Version: 1.0
Sender: Al Viro

return ERR_PTR() on failure, get rid of errorp

Signed-off-by: Al Viro
Reviewed-by: Christian Brauner
---
 fs/file.c               | 14 ++++----------
 include/linux/fdtable.h |  2 +-
 kernel/fork.c           | 26 ++++++++++++--------------
 3 files changed, 17 insertions(+), 25 deletions(-)

diff --git a/fs/file.c b/fs/file.c
index 01cef75ef132..b8b5b615d116 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -306,17 +306,16 @@ static unsigned int sane_fdtable_size(struct fdtable *fdt, unsigned int max_fds)
  * passed in files structure.
  * errorp will be valid only when the returned files_struct is NULL.
  */
-struct files_struct *dup_fd(struct files_struct *oldf, unsigned int max_fds, int *errorp)
+struct files_struct *dup_fd(struct files_struct *oldf, unsigned int max_fds)
 {
 	struct files_struct *newf;
 	struct file **old_fds, **new_fds;
 	unsigned int open_files, i;
 	struct fdtable *old_fdt, *new_fdt;
 
-	*errorp = -ENOMEM;
 	newf = kmem_cache_alloc(files_cachep, GFP_KERNEL);
 	if (!newf)
-		goto out;
+		return ERR_PTR(-ENOMEM);
 
 	atomic_set(&newf->count, 1);
 
@@ -346,8 +345,8 @@ struct files_struct *dup_fd(struct files_struct *oldf, unsigned int max_fds, int
 
 		new_fdt = alloc_fdtable(open_files);
 		if (IS_ERR(new_fdt)) {
-			*errorp = PTR_ERR(new_fdt);
-			goto out_release;
+			kmem_cache_free(files_cachep, newf);
+			return ERR_CAST(new_fdt);
 		}
 
 		/*
@@ -388,11 +387,6 @@ struct files_struct *dup_fd(struct files_struct *oldf, unsigned int max_fds, int
 	rcu_assign_pointer(newf->fdt, new_fdt);
 
 	return newf;
-
-out_release:
-	kmem_cache_free(files_cachep, newf);
-out:
-	return NULL;
 }
 
 static struct fdtable *close_files(struct files_struct * files)
diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h
index 42cadad89f99..b1a913a17d04 100644
--- a/include/linux/fdtable.h
+++ b/include/linux/fdtable.h
@@ -102,7 +102,7 @@ struct task_struct;
 
 void put_files_struct(struct files_struct *fs);
 int unshare_files(void);
-struct files_struct *dup_fd(struct files_struct *, unsigned, int *) __latent_entropy;
+struct files_struct *dup_fd(struct files_struct *, unsigned) __latent_entropy;
 void do_close_on_exec(struct files_struct *);
 int iterate_fd(struct files_struct *, unsigned,
 		int (*)(const void *, struct file *, unsigned),
diff --git a/kernel/fork.c b/kernel/fork.c
index cc760491f201..67ab37db6400 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1754,33 +1754,30 @@ static int copy_files(unsigned long clone_flags, struct task_struct *tsk,
 		      int no_files)
 {
 	struct files_struct *oldf, *newf;
-	int error = 0;
 
 	/*
 	 * A background process may not have any files ...
 	 */
 	oldf = current->files;
 	if (!oldf)
-		goto out;
+		return 0;
 
 	if (no_files) {
 		tsk->files = NULL;
-		goto out;
+		return 0;
 	}
 
 	if (clone_flags & CLONE_FILES) {
 		atomic_inc(&oldf->count);
-		goto out;
+		return 0;
 	}
 
-	newf = dup_fd(oldf, NR_OPEN_MAX, &error);
-	if (!newf)
-		goto out;
+	newf = dup_fd(oldf, NR_OPEN_MAX);
+	if (IS_ERR(newf))
+		return PTR_ERR(newf);
 
 	tsk->files = newf;
-	error = 0;
-out:
-	return error;
+	return 0;
 }
 
 static int copy_sighand(unsigned long clone_flags, struct task_struct *tsk)
@@ -3236,13 +3233,14 @@ int unshare_fd(unsigned long unshare_flags, unsigned int max_fds,
 	       struct files_struct **new_fdp)
 {
 	struct files_struct *fd = current->files;
-	int error = 0;
 
 	if ((unshare_flags & CLONE_FILES) &&
 	    (fd && atomic_read(&fd->count) > 1)) {
-		*new_fdp = dup_fd(fd, max_fds, &error);
-		if (!*new_fdp)
-			return error;
+		*new_fdp = dup_fd(fd, max_fds);
+		if (IS_ERR(*new_fdp)) {
+			*new_fdp = NULL;
+			return PTR_ERR(new_fdp);
+		}
 	}
 
 	return 0;
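
For readers less familiar with the ERR_PTR() convention this patch switches dup_fd() to, here is a rough userspace sketch of it. Everything in it is an assumption made for illustration: the ERR_PTR()/IS_ERR()/PTR_ERR() helpers are simplified stand-ins for the kernel's <linux/err.h>, and struct table, dup_table() and unshare_table() are hypothetical names, not kernel interfaces. The failure condition in dup_table() is artificial and only there to exercise the error path. Note that with this convention the errno has to be extracted from the returned pointer value itself, before that value is overwritten. In the unshare_fd() hunk, PTR_ERR() is applied to new_fdp after *new_fdp has been cleared, which does not look like it would yield the dup_fd() error. The sketch keeps the result in a local and stores it through the out-parameter only on success.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for the kernel's <linux/err.h> helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errnos. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for struct files_struct. */
struct table {
	unsigned int nr;
};

/*
 * New-style convention: a valid pointer on success, ERR_PTR(-errno) on
 * failure; no separate "int *errorp" argument.  The nr > max check is an
 * artificial failure used only for this illustration.
 */
static struct table *dup_table(const struct table *old, unsigned int max)
{
	struct table *new;

	if (old->nr > max)
		return ERR_PTR(-EMFILE);

	new = malloc(sizeof(*new));
	if (!new)
		return ERR_PTR(-ENOMEM);

	new->nr = old->nr;
	return new;
}

/*
 * Caller with an out-parameter.  The result is held in a local, so the
 * errno is read from the returned pointer and *new_p is only written on
 * success.
 */
static int unshare_table(const struct table *old, unsigned int max,
			 struct table **new_p)
{
	struct table *new = dup_table(old, max);

	if (IS_ERR(new))
		return PTR_ERR(new);

	*new_p = new;
	return 0;
}

/* Small self-check showing both the success and the failure path. */
static void self_check(void)
{
	struct table old = { .nr = 8 };
	struct table *copy = NULL;

	assert(unshare_table(&old, 16, &copy) == 0);
	assert(copy && copy->nr == 8);
	free(copy);

	copy = NULL;
	assert(unshare_table(&old, 4, &copy) == -EMFILE);
	assert(copy == NULL);
}
```

Calling self_check() from a test harness exercises both paths. The failure case leaves the caller's pointer untouched, which is the property the out-parameter callers in the patch rely on.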