Message ID: 20181211172423.7709-3-jack@suse.cz (mailing list archive)
State: New, archived
Series: vfs: Fix crashes when reading /proc/mounts
On Tue, Dec 11, 2018 at 06:24:23PM +0100, Jan Kara wrote:
> Readers of /proc/mounts (and similar files) are currently unprotected
> from concurrently running remount on the filesystem they are reporting.
> This can not only result in bogus reported information but also in
> confusion in filesystems ->show_options callbacks resulting in
> use-after-free issues or similar (for example ext4 handling of quota
> file names is prone to such races).
>
> Fix the problem by protecting showing of mount information with
> sb->s_umount semaphore.

Umm... Which tree is it against and what exactly does your hold_sb() do?
On Tue, Dec 11, 2018 at 06:36:19PM +0000, Al Viro wrote:
> On Tue, Dec 11, 2018 at 06:24:23PM +0100, Jan Kara wrote:
> > Readers of /proc/mounts (and similar files) are currently unprotected
> > from concurrently running remount on the filesystem they are reporting.
> > This can not only result in bogus reported information but also in
> > confusion in filesystems ->show_options callbacks resulting in
> > use-after-free issues or similar (for example ext4 handling of quota
> > file names is prone to such races).
> >
> > Fix the problem by protecting showing of mount information with
> > sb->s_umount semaphore.
>
> Umm... Which tree is it against and what exactly does your hold_sb() do?

D'oh... I need more coffee. Nevermind.
On Tue, Dec 11, 2018 at 06:24:23PM +0100, Jan Kara wrote:
> Readers of /proc/mounts (and similar files) are currently unprotected
> from concurrently running remount on the filesystem they are reporting.
> This can not only result in bogus reported information but also in
> confusion in filesystems ->show_options callbacks resulting in
> use-after-free issues or similar (for example ext4 handling of quota
> file names is prone to such races).
>
> Fix the problem by protecting showing of mount information with
> sb->s_umount semaphore.

> +static bool mounts_trylock_super(struct proc_mounts *p, struct super_block *sb)
> +{
> +	if (p->locked_sb == sb)
> +		return true;
> +	if (p->locked_sb) {
> +		drop_super(p->locked_sb);
> +		p->locked_sb = NULL;
> +	}
> +	if (down_read_trylock(&sb->s_umount)) {
> +		hold_sb(sb);
> +		p->locked_sb = sb;
> +		return true;
> +	}
> +	return false;
> +}

Bad calling conventions, and you are paying for those with making
hold_sb() non-static (and having it, in the first place).

> +	if (mounts_trylock_super(p, sb))
> +		return p->cached_mount;
> +	/*
> +	 * Trylock failed. Since namespace_sem ranks below s_umount (through
> +	 * sb->s_umount > dir->i_rwsem > namespace_sem in the mount path), we
> +	 * have to drop it, wait for s_umount and then try again to guarantee
> +	 * forward progress.
> +	 */
> +	hold_sb(sb);

That. Just hoist that hold_sb() into your trylock (and put it before the
down_read_trylock() there, while we are at it). And turn the other caller
into
	if (unlikely(!.....))
		ret = -EAGAIN;
	else
		ret = p->show(m, &r->mnt);
followed by unconditional drop_super(). And I would probably go for
	mount_trylock_super(&p->locked_super, sb)
while we are at it, so that it's isolated from proc_mounts and can
be moved to fs/super.c

> +	up_read(&namespace_sem);
> +	down_read(&sb->s_umount);
> +	/*
> +	 * Sb may be dead by now but that just means we won't find it in any
> +	 * mount and drop lock & reference anyway.
> +	 */
> +	p->locked_sb = sb;
> +	goto restart;

No.
	if (likely(sb->s_root))
		p->locked_sb = sb;
	else
		drop_super(sb);
	goto restart;
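As an aside for readers following the archive: the trylock-or-drop-and-restart dance under review can be condensed into a few lines of userspace C. This is a hypothetical single-threaded sketch (the names `toy_lock`, `try_read_lock` and `lock_sb_or_restart` are invented; the kernel rwsems are reduced to flags and "blocking" is simulated by assuming the writer finishes), meant only to show the control flow dictated by the namespace_sem < s_umount lock order:

```c
#include <stdbool.h>

/* Each "lock" is reduced to a writer-busy flag. */
struct toy_lock { bool write_locked; };

/* Models down_read_trylock(): readers get in unless a writer holds it. */
static bool try_read_lock(struct toy_lock *l)
{
	return !l->write_locked;
}

/*
 * Models the m_start() slow path from the patch: while holding the
 * lower-ranked namespace_sem we may only *try* the higher-ranked
 * s_umount; if that fails we must drop namespace_sem first, then
 * block on s_umount (simulated here by waiting out the writer), and
 * tell the caller to restart the walk from scratch.
 */
static bool lock_sb_or_restart(bool *ns_held, struct toy_lock *umount)
{
	if (try_read_lock(umount))
		return true;              /* fast path: proceed under both locks */
	*ns_held = false;                 /* up_read(&namespace_sem) */
	umount->write_locked = false;     /* "block" until the remount finishes */
	return false;                     /* goto restart with s_umount now held */
}
```

The key property the sketch preserves is that the lower-ranked lock is never held while sleeping on the higher-ranked one.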
On Tue, Dec 11, 2018 at 06:58:31PM +0000, Al Viro wrote:
> > +static bool mounts_trylock_super(struct proc_mounts *p, struct super_block *sb)
> > +{
> > +	if (p->locked_sb == sb)
> > +		return true;
> > +	if (p->locked_sb) {
> > +		drop_super(p->locked_sb);
> > +		p->locked_sb = NULL;
> > +	}
> > +	if (down_read_trylock(&sb->s_umount)) {
> > +		hold_sb(sb);
> > +		p->locked_sb = sb;
> > +		return true;
> > +	}
> > +	return false;
> > +}
>
> Bad calling conventions, and you are paying for those with making
> hold_sb() non-static (and having it, in the first place).
>
> > +	if (mounts_trylock_super(p, sb))
> > +		return p->cached_mount;
> > +	/*
> > +	 * Trylock failed. Since namespace_sem ranks below s_umount (through
> > +	 * sb->s_umount > dir->i_rwsem > namespace_sem in the mount path), we
> > +	 * have to drop it, wait for s_umount and then try again to guarantee
> > +	 * forward progress.
> > +	 */
> > +	hold_sb(sb);
>
> That. Just hoist that hold_sb() into your trylock (and put it before the
> down_read_trylock() there, while we are at it). And turn the other caller
> into
> 	if (unlikely(!.....))
> 		ret = -EAGAIN;
> 	else
> 		ret = p->show(m, &r->mnt);
> followed by unconditional drop_super(). And I would probably go for
> 	mount_trylock_super(&p->locked_super, sb)
> while we are at it, so that it's isolated from proc_mounts and can
> be moved to fs/super.c

Looking at it some more... I still hate it ;-/  Take a look at traverse()
in fs/seq_file.c and think what kind of clusterfuck will it cause...
On Tue 11-12-18 19:14:52, Al Viro wrote:
> On Tue, Dec 11, 2018 at 06:58:31PM +0000, Al Viro wrote:
> > > +static bool mounts_trylock_super(struct proc_mounts *p, struct super_block *sb)
> > > +{
> > > +	if (p->locked_sb == sb)
> > > +		return true;
> > > +	if (p->locked_sb) {
> > > +		drop_super(p->locked_sb);
> > > +		p->locked_sb = NULL;
> > > +	}
> > > +	if (down_read_trylock(&sb->s_umount)) {
> > > +		hold_sb(sb);
> > > +		p->locked_sb = sb;
> > > +		return true;
> > > +	}
> > > +	return false;
> > > +}
> >
> > Bad calling conventions, and you are paying for those with making
> > hold_sb() non-static (and having it, in the first place).
> >
> > > +	if (mounts_trylock_super(p, sb))
> > > +		return p->cached_mount;
> > > +	/*
> > > +	 * Trylock failed. Since namespace_sem ranks below s_umount (through
> > > +	 * sb->s_umount > dir->i_rwsem > namespace_sem in the mount path), we
> > > +	 * have to drop it, wait for s_umount and then try again to guarantee
> > > +	 * forward progress.
> > > +	 */
> > > +	hold_sb(sb);
> >
> > That. Just hoist that hold_sb() into your trylock (and put it before the
> > down_read_trylock() there, while we are at it). And turn the other caller
> > into
> > 	if (unlikely(!.....))
> > 		ret = -EAGAIN;
> > 	else
> > 		ret = p->show(m, &r->mnt);
> > followed by unconditional drop_super(). And I would probably go for
> > 	mount_trylock_super(&p->locked_super, sb)
> > while we are at it, so that it's isolated from proc_mounts and can
> > be moved to fs/super.c
>
> Looking at it some more... I still hate it ;-/  Take a look at traverse()
> in fs/seq_file.c and think what kind of clusterfuck will it cause...

I guess you mean that in case we fail to lock the s_umount semaphore, we'll
return -EAGAIN and traverse() will abort? That is true, but since we return
-EAGAIN, callers will call into traverse() again - both do:

	while ((err = traverse(m, *ppos)) == -EAGAIN)
		;

and then in m_start() we will do the blocking lock on s_umount.

I agree it is ugly and twisted but it should be rare... Now looking at the
code, maybe we could avoid this weird retry dance with traverse().
Something like the following in m_show():

	sb = mnt->mnt_sb;
	if (mount_trylock_super())
		show and done
	get passive sb reference
	namespace_unlock();
	down_read(&sb->s_umount);
	namespace_lock();
	new_mnt = seq_list_start();
	if (new_mnt != mnt)
		retry
	show and done

This could be handled completely inside m_show() so no strange retry dance
with traverse(). Do you find this better?

								Honza
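The retry contract Jan describes — m_show() returning -EAGAIN, traverse() aborting, and seq_read()/seq_lseek() looping until a restarted m_start() takes the lock blockingly — can be condensed into a toy model. Everything below (`toy_seq`, `toy_traverse`, `toy_read`) is a hypothetical userspace sketch of that control flow, not the real fs/seq_file.c code:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_seq {
	bool sb_contended;   /* models a remount holding sb->s_umount */
	void *locked_sb;     /* models p->locked_sb carried across passes */
	int passes;          /* how many times traverse() ran */
};

/* Models one traverse() pass over the mounts seq_file. */
static int toy_traverse(struct toy_seq *p)
{
	p->passes++;
	if (!p->locked_sb && p->sb_contended) {
		/*
		 * m_show() failed the trylock and returned -EAGAIN; on the
		 * restarted pass, m_start() takes s_umount blockingly and
		 * remembers it in locked_sb, so the retry cannot fail the
		 * same way - that is the forward-progress guarantee.
		 */
		p->sb_contended = false;
		p->locked_sb = (void *)1;
		return -EAGAIN;
	}
	return 0;            /* ->show() succeeded */
}

/* Models the caller-side loop in seq_read()/seq_lseek(). */
static int toy_read(struct toy_seq *p)
{
	int err;

	while ((err = toy_traverse(p)) == -EAGAIN)
		;
	return err;
}
```

The point the model makes is that the -EAGAIN loop terminates after one retry, because the state saved on the failing pass guarantees the next pass holds the lock.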
diff --git a/fs/mount.h b/fs/mount.h
index f39bc9da4d73..5ca0a9e714b3 100644
--- a/fs/mount.h
+++ b/fs/mount.h
@@ -134,6 +134,7 @@ struct proc_mounts {
 	void *cached_mount;
 	u64 cached_event;
 	loff_t cached_index;
+	struct super_block *locked_sb;
 };
 
 extern const struct seq_operations mounts_op;
diff --git a/fs/namespace.c b/fs/namespace.c
index a7f91265ea67..706c802cb75b 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1237,26 +1237,68 @@ struct vfsmount *mnt_clone_internal(const struct path *path)
 }
 
 #ifdef CONFIG_PROC_FS
+
+static bool mounts_trylock_super(struct proc_mounts *p, struct super_block *sb)
+{
+	if (p->locked_sb == sb)
+		return true;
+	if (p->locked_sb) {
+		drop_super(p->locked_sb);
+		p->locked_sb = NULL;
+	}
+	if (down_read_trylock(&sb->s_umount)) {
+		hold_sb(sb);
+		p->locked_sb = sb;
+		return true;
+	}
+	return false;
+}
+
 /* iterator; we want it to have access to namespace_sem, thus here... */
 static void *m_start(struct seq_file *m, loff_t *pos)
 {
 	struct proc_mounts *p = m->private;
+	struct mount *mount;
+	struct super_block *sb;
 
+restart:
 	down_read(&namespace_sem);
 	if (p->cached_event == p->ns->event) {
 		void *v = p->cached_mount;
 		if (*pos == p->cached_index)
-			return v;
+			goto lock_sb;
 		if (*pos == p->cached_index + 1) {
 			v = seq_list_next(v, &p->ns->list, &p->cached_index);
-			return p->cached_mount = v;
+			p->cached_mount = v;
+			goto lock_sb;
 		}
 	}
 
 	p->cached_event = p->ns->event;
 	p->cached_mount = seq_list_start(&p->ns->list, *pos);
 	p->cached_index = *pos;
-	return p->cached_mount;
+lock_sb:
+	if (!p->cached_mount)
+		return NULL;
+	mount = list_entry(p->cached_mount, struct mount, mnt_list);
+	sb = mount->mnt.mnt_sb;
+	if (mounts_trylock_super(p, sb))
+		return p->cached_mount;
+	/*
+	 * Trylock failed. Since namespace_sem ranks below s_umount (through
+	 * sb->s_umount > dir->i_rwsem > namespace_sem in the mount path), we
+	 * have to drop it, wait for s_umount and then try again to guarantee
+	 * forward progress.
+	 */
+	hold_sb(sb);
+	up_read(&namespace_sem);
+	down_read(&sb->s_umount);
+	/*
+	 * Sb may be dead by now but that just means we won't find it in any
+	 * mount and drop lock & reference anyway.
+	 */
+	p->locked_sb = sb;
+	goto restart;
 }
 
 static void *m_next(struct seq_file *m, void *v, loff_t *pos)
@@ -1277,7 +1319,16 @@ static int m_show(struct seq_file *m, void *v)
 {
 	struct proc_mounts *p = m->private;
 	struct mount *r = list_entry(v, struct mount, mnt_list);
-	return p->show(m, &r->mnt);
+	struct super_block *sb = r->mnt.mnt_sb;
+	int ret;
+
+	/* Protect show function against racing remount */
+	if (!mounts_trylock_super(p, sb))
+		return -EAGAIN;
+	ret = p->show(m, &r->mnt);
+	drop_super(sb);
+	p->locked_sb = NULL;
+	return ret;
 }
 
 const struct seq_operations mounts_op = {
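Al's suggested calling convention — taking the reference *before* the trylock, so the helper is self-contained enough to move into fs/super.c — might look roughly like the following toy userspace model. All of it is invented stand-in code (`toy_sb` with plain `s_count`, `umount_busy` and `read_locked` fields replaces the real super_block refcount and s_umount rwsem), sketching the convention rather than the kernel implementation:

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_sb {
	int s_count;         /* models the superblock reference count */
	bool umount_busy;    /* models a writer holding s_umount */
	bool read_locked;    /* models holding s_umount for read */
};

static void hold_sb(struct toy_sb *sb)
{
	sb->s_count++;
}

static void drop_super(struct toy_sb *sb)
{
	/* models up_read(&sb->s_umount) plus dropping the reference */
	sb->read_locked = false;
	sb->s_count--;
}

/*
 * Sketch of mount_trylock_super(&p->locked_sb, sb): the reference is
 * taken before the trylock, so on failure the caller already holds it
 * and can block on the lock itself without a separate hold_sb().
 */
static bool mount_trylock_super(struct toy_sb **locked, struct toy_sb *sb)
{
	if (*locked == sb)
		return true;              /* already locked from the last cycle */
	if (*locked) {
		drop_super(*locked);
		*locked = NULL;
	}
	hold_sb(sb);                      /* reference first, per the review */
	if (!sb->umount_busy) {           /* down_read_trylock(&sb->s_umount) */
		sb->read_locked = true;
		*locked = sb;
		return true;
	}
	return false;                     /* ref retained; caller blocks/retries */
}
```

Note how the helper now touches nothing but the `locked` slot and the superblock, which is what makes it independent of proc_mounts.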
Readers of /proc/mounts (and similar files) are currently unprotected
from concurrently running remount on the filesystem they are reporting.
This can not only result in bogus reported information but also in
confusion in filesystems ->show_options callbacks resulting in
use-after-free issues or similar (for example ext4 handling of quota
file names is prone to such races).

Fix the problem by protecting showing of mount information with
sb->s_umount semaphore.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/mount.h     |  1 +
 fs/namespace.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 56 insertions(+), 4 deletions(-)