diff mbox series

[v2] fs: consistently deref the files table with rcu_dereference_raw()

Message ID 20250313135725.1320914-1-mjguzik@gmail.com (mailing list archive)
State New
Headers show
Series [v2] fs: consistently deref the files table with rcu_dereference_raw() | expand

Commit Message

Mateusz Guzik March 13, 2025, 1:57 p.m. UTC
... except when the table is known to be only used by one thread.

A file pointer can get installed at any moment despite the ->file_lock
being held since the following:
8a81252b774b53e6 ("fs/file.c: don't acquire files->file_lock in fd_install()")

Accesses subject to such a race can in principle suffer load tearing.

While here redo the comment in dup_fd -- it only covered a race against
files showing up, still assuming fd_install() takes the lock.

Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
---

v2: s/rcu_access_pointer/rcu_dererence_raw/

I confirmed the possiblity of the problem with this:
https://lwn.net/Articles/793253/#Load%20Tearing

Granted, the article being 6 years old might mean some magic was added
by now to prevent this particular problem.

While technically this classifies as a bugfix, given that nothing blew
up and this is more of a "just in case" change, I don't think this
warrants any backports. Thus I'm not adding a Fixes: tag to prevent this
from being picked by autosel.

 fs/file.c | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

Comments

Christian Brauner March 14, 2025, 3:40 p.m. UTC | #1
On Thu, 13 Mar 2025 14:57:25 +0100, Mateusz Guzik wrote:
> ... except when the table is known to be only used by one thread.
> 
> A file pointer can get installed at any moment despite the ->file_lock
> being held since the following:
> 8a81252b774b53e6 ("fs/file.c: don't acquire files->file_lock in fd_install()")
> 
> Accesses subject to such a race can in principle suffer load tearing.
> 
> [...]

Applied to the vfs-6.15.misc branch of the vfs/vfs.git tree.
Patches in the vfs-6.15.misc branch should appear in linux-next soon.

Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.

It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.

Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs-6.15.misc

[1/1] fs: consistently deref the files table with rcu_dereference_raw()
      https://git.kernel.org/vfs/vfs/c/c72b20d10034
diff mbox series

Patch

diff --git a/fs/file.c b/fs/file.c
index 6c159ede55f1..58ff50094525 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -423,17 +423,25 @@  struct files_struct *dup_fd(struct files_struct *oldf, struct fd_range *punch_ho
 	old_fds = old_fdt->fd;
 	new_fds = new_fdt->fd;
 
+	/*
+	 * We may be racing against fd allocation from other threads using this
+	 * files_struct, despite holding ->file_lock.
+	 *
+	 * alloc_fd() might have already claimed a slot, while fd_install()
+	 * did not populate it yet. Note the latter operates locklessly, so
+	 * the file can show up as we are walking the array below.
+	 *
+	 * At the same time we know no files will disappear as all other
+	 * operations take the lock.
+	 *
+	 * Instead of trying to placate userspace racing with itself, we
+	 * ref the file if we see it and mark the fd slot as unused otherwise.
+	 */
 	for (i = open_files; i != 0; i--) {
-		struct file *f = *old_fds++;
+		struct file *f = rcu_dereference_raw(*old_fds++);
 		if (f) {
 			get_file(f);
 		} else {
-			/*
-			 * The fd may be claimed in the fd bitmap but not yet
-			 * instantiated in the files array if a sibling thread
-			 * is partway through open().  So make sure that this
-			 * fd is available to the new process.
-			 */
 			__clear_open_fd(open_files - i, new_fdt);
 		}
 		rcu_assign_pointer(*new_fds++, f);
@@ -684,7 +692,7 @@  struct file *file_close_fd_locked(struct files_struct *files, unsigned fd)
 		return NULL;
 
 	fd = array_index_nospec(fd, fdt->max_fds);
-	file = fdt->fd[fd];
+	file = rcu_dereference_raw(fdt->fd[fd]);
 	if (file) {
 		rcu_assign_pointer(fdt->fd[fd], NULL);
 		__put_unused_fd(files, fd);
@@ -1252,7 +1260,7 @@  __releases(&files->file_lock)
 	 */
 	fdt = files_fdtable(files);
 	fd = array_index_nospec(fd, fdt->max_fds);
-	tofree = fdt->fd[fd];
+	tofree = rcu_dereference_raw(fdt->fd[fd]);
 	if (!tofree && fd_is_open(fd, fdt))
 		goto Ebusy;
 	get_file(file);