diff mbox series

[v2,2/2] FUSE: Retire superblock on force unmount

Message ID 20220512082832.v2.2.I692165059274c30b59bed56940b54a573ccb46e4@changeid (mailing list archive)
State New, archived
Headers show
Series Prevent re-use of FUSE superblock after force unmount | expand

Commit Message

Daniil Lunev May 11, 2022, 10:29 p.m. UTC
Force unmount of FUSE severes the connection with the user space, even
if there are still open files. Subsequent remount tries to re-use the
superblock held by the open files, which is meaningless in the FUSE case
after disconnect - reused super block doesn't have userspace counterpart
attached to it and is incapable of doing any IO.

Signed-off-by: Daniil Lunev <dlunev@chromium.org>
---

Changes in v2:
- Use an exported function instead of directly modifying superblock

 fs/fuse/inode.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

Comments

Al Viro May 17, 2022, 10:20 p.m. UTC | #1
On Thu, May 12, 2022 at 08:29:10AM +1000, Daniil Lunev wrote:
> Force unmount of FUSE severes the connection with the user space, even
> if there are still open files. Subsequent remount tries to re-use the
> superblock held by the open files, which is meaningless in the FUSE case
> after disconnect - reused super block doesn't have userspace counterpart
> attached to it and is incapable of doing any IO.

	Why not simply have those simply rejected by fuse_test_super()?
Looks like that would be much smaller and less invasive patch...
Confused...
Al Viro May 17, 2022, 10:32 p.m. UTC | #2
On Tue, May 17, 2022 at 10:20:06PM +0000, Al Viro wrote:
> On Thu, May 12, 2022 at 08:29:10AM +1000, Daniil Lunev wrote:
> > Force unmount of FUSE severes the connection with the user space, even
> > if there are still open files. Subsequent remount tries to re-use the
> > superblock held by the open files, which is meaningless in the FUSE case
> > after disconnect - reused super block doesn't have userspace counterpart
> > attached to it and is incapable of doing any IO.
> 
> 	Why not simply have those simply rejected by fuse_test_super()?
> Looks like that would be much smaller and less invasive patch...
> Confused...

... because Miklos had suggested that, apparently ;-/  I disagree -
that approach has more side effects.  "mount will skip that sucker" is,
AFAICS, the only effect of modiyfing test_super callback(s); yours, OTOH...

Note that generic_shutdown_super() is *not* called while superblock is
mounted anywhere.  And it doesn't get to eviction from the list while it still
has live dentries.  Or inodes, for that matter.

So this
        if (sb->s_bdi != &noop_backing_dev_info) {
		if (sb->s_iflags & SB_I_PERSB_BDI)
			bdi_unregister(sb->s_bdi);
		bdi_put(sb->s_bdi);
		sb->s_bdi = &noop_backing_dev_info;
	}
is almost certainly not safe to be done on a live superblock.
Daniil Lunev May 17, 2022, 11:27 p.m. UTC | #3
Hi Al,
Here is my v1 version of the patch which does skip the node in testing super
https://lore.kernel.org/linux-fsdevel/20220511013057.245827-1-dlunev@chromium.org/T/#u
I am not particular which version is to be used. However note, it still needs
to remove the bdi node since otherwise the new mount will have a conflict.
The bdi put should not kill the bdi if it has more references (and then
conflict would still occur, but chances of bdi being opened for a long time
is low AFAIU), thus this should be safe, but I might be missing something.
Note, that FUSE super block at the time is barely alive, it may have open
filehandles/references, but the connection with user space is already cut
and all of the ops would return transport error.
Let me know what you think and how we can address it better maybe?
Thanks,
Daniil
Miklos Szeredi May 18, 2022, 11:57 a.m. UTC | #4
On Wed, 18 May 2022 at 00:32, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Tue, May 17, 2022 at 10:20:06PM +0000, Al Viro wrote:
> > On Thu, May 12, 2022 at 08:29:10AM +1000, Daniil Lunev wrote:
> > > Force unmount of FUSE severes the connection with the user space, even
> > > if there are still open files. Subsequent remount tries to re-use the
> > > superblock held by the open files, which is meaningless in the FUSE case
> > > after disconnect - reused super block doesn't have userspace counterpart
> > > attached to it and is incapable of doing any IO.
> >
> >       Why not simply have those simply rejected by fuse_test_super()?
> > Looks like that would be much smaller and less invasive patch...
> > Confused...
>
> ... because Miklos had suggested that, apparently ;-/  I disagree -
> that approach has more side effects.  "mount will skip that sucker" is,
> AFAICS, the only effect of modiyfing test_super callback(s); yours, OTOH...

Yep, messing with the bdi doesn't look good.  Fuse always uses a
private bdi, so it's not even necessary.

Just removing from type->fs_supers should not have any side effects,
at least I can't spot any.

Fixing fuse_test_super() is not sufficient, as the fuseblk type goes
though get_tree_bdev().  That could be tweaked as well, but it would
end up with more complexity.

Thanks,
Miklos
Daniil Lunev May 18, 2022, 10:55 p.m. UTC | #5
> Yep, messing with the bdi doesn't look good.  Fuse always uses a
> private bdi, so it's not even necessary.

The reason I needed to remove the bdi is name collision - fuse
generates a fixed name for its bdi based on the underlying block
device. However, those collisions of mine were conducted on a
version prior to the private bdi introduction, I am not sure if that
is supposed to fix the collision issue. Need to check

Thanks,
Daniil
Daniil Lunev May 23, 2022, 12:25 a.m. UTC | #6
So, I tried this patchset with open bdi elements during force unmount
and a random file open [1], and didn't see any major drama with
force unmounting the node, after re-mounting, read on sysfs node
returned "no such device", which is expected.
With private bdi flag patch, unless bdi is unregister on force unmount
in fuse, it will complain on name collision [2] (because the patch
actually doesn't do much but unregisters the bdi on unmount, which
seems to happen ok even if node is busy).
Let me know if I am missing anything or if there are any other
concerns, and advise what would be the best way to move this
forward.

Thanks,
Daniil.


[1] Python shell
>>> f1 = open('/sys/class/bdi/8:0-fuseblk/read_ahead_kb', 'r')
>>> f2 = open('/media/removable/USB Drive/m1', 'w')

[2]
[  149.826508] sysfs: cannot create duplicate filename
'/devices/virtual/bdi/8:0-fuseblk'

On Thu, May 19, 2022 at 8:55 AM Daniil Lunev <dlunev@chromium.org> wrote:
>
> > Yep, messing with the bdi doesn't look good.  Fuse always uses a
> > private bdi, so it's not even necessary.
>
> The reason I needed to remove the bdi is name collision - fuse
> generates a fixed name for its bdi based on the underlying block
> device. However, those collisions of mine were conducted on a
> version prior to the private bdi introduction, I am not sure if that
> is supposed to fix the collision issue. Need to check
>
> Thanks,
> Daniil
Miklos Szeredi May 24, 2022, 2:22 p.m. UTC | #7
On Mon, 23 May 2022 at 02:25, Daniil Lunev <dlunev@chromium.org> wrote:
>
> So, I tried this patchset with open bdi elements during force unmount
> and a random file open [1], and didn't see any major drama with
> force unmounting the node, after re-mounting, read on sysfs node
> returned "no such device", which is expected.
> With private bdi flag patch, unless bdi is unregister on force unmount
> in fuse, it will complain on name collision [2] (because the patch
> actually doesn't do much but unregisters the bdi on unmount, which
> seems to happen ok even if node is busy).

Calling bdi_unregister() might be okay, and that should fix this.  I'm
not familiar enough with that part to say for sure.

But freeing sb->s_bdi while the superblock is active looks problematic.

Thanks,
Miklos
Daniil Lunev May 24, 2022, 10:44 p.m. UTC | #8
> Calling bdi_unregister() might be okay, and that should fix this.  I'm
> not familiar enough with that part to say for sure.
> But freeing sb->s_bdi while the superblock is active looks problematic.
Tracing the code, I see that, yes, in general that might not be safe to call
the "bdi_put" function for any FS - because it might have in-flight ops even
with force, where they will routinely access the members of the bdi struct
without locks. However, we do replace the struct with a no_op version,
and specifically for the FUSE case we sever the connection first, so no
in-flight ops can actually be there. And I am not sure if other FS may
need to do this retirement, given they don't have these
user-space/kernel split. It would make sense however to delay the actual
put till  the actual super block destruction, but that would require
introducing extra state tracking to see if the block is or is not registered
anymore. It can be piggybacked on the v1 patch where I have an explicit
state flag, but not on v2.
Miklos, Al, will the following work and be acceptable?
Get v1 patchset[1], in fuse_umount_begin do bdi_unregister and set
the flag, but do not do bdi_put or replacement with no_op. And then in
generic shutdown super if the bdi is not no_op and the 'defunct' flag is
set, skip unregister part.
Thanks,
Daniil

[1]
https://lore.kernel.org/linux-fsdevel/20220511013057.245827-1-dlunev@chromium.org/T/#u
Daniil Lunev May 30, 2022, 1:41 a.m. UTC | #9
I have prepared the v3 patch as described in my previous email. PTAL.

Thanks,
Daniil

On Wed, May 25, 2022 at 8:44 AM Daniil Lunev <dlunev@chromium.org> wrote:
>
> > Calling bdi_unregister() might be okay, and that should fix this.  I'm
> > not familiar enough with that part to say for sure.
> > But freeing sb->s_bdi while the superblock is active looks problematic.
> Tracing the code, I see that, yes, in general that might not be safe to call
> the "bdi_put" function for any FS - because it might have in-flight ops even
> with force, where they will routinely access the members of the bdi struct
> without locks. However, we do replace the struct with a no_op version,
> and specifically for the FUSE case we sever the connection first, so no
> in-flight ops can actually be there. And I am not sure if other FS may
> need to do this retirement, given they don't have these
> user-space/kernel split. It would make sense however to delay the actual
> put till  the actual super block destruction, but that would require
> introducing extra state tracking to see if the block is or is not registered
> anymore. It can be piggybacked on the v1 patch where I have an explicit
> state flag, but not on v2.
> Miklos, Al, will the following work and be acceptable?
> Get v1 patchset[1], in fuse_umount_begin do bdi_unregister and set
> the flag, but do not do bdi_put or replacement with no_op. And then in
> generic shutdown super if the bdi is not no_op and the 'defunct' flag is
> set, skip unregister part.
> Thanks,
> Daniil
>
> [1]
> https://lore.kernel.org/linux-fsdevel/20220511013057.245827-1-dlunev@chromium.org/T/#u
diff mbox series

Patch

diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 8c0665c5dff88..8875361544b2a 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -476,8 +476,11 @@  static void fuse_umount_begin(struct super_block *sb)
 {
 	struct fuse_conn *fc = get_fuse_conn_super(sb);
 
-	if (!fc->no_force_umount)
-		fuse_abort_conn(fc);
+	if (fc->no_force_umount)
+		return;
+
+	fuse_abort_conn(fc);
+	retire_super(sb);
 }
 
 static void fuse_send_destroy(struct fuse_mount *fm)