Message ID | 87d75575e16637a84b82326d5c53cb78cdf9a7e0.1709991203.git.anand.jain@oracle.com
---|---
State | New, archived
Series | btrfs: enhanced logic for stray single device
On Sat, Mar 09, 2024 at 07:14:31PM +0530, Anand Jain wrote:
> Tempfsid assumes that all registered single devices in the fs_devices list
> are to be mounted; otherwise, they won't be in the btrfs_device list.
>
> We recently fixed a related bug caused by leaving a failed-open device in
> the list. This wrongly triggered tempfsid activation upon subsequent
> mounts of the same fsid.
>
> To prevent this, scan the entire device list at mount for any stray
> devices and free them in btrfs_scan_one_device().

Is this an additional precaution on top of maintaining an invariant on
every umount/failed mount that we have freed stale devices of
single-device filesystems? Or is it fundamentally impossible for us to
enforce that invariant?

It feels like overkill to hack up free_stale_devices in this way,
compared to just ensuring that we manage cleaning up single-device
filesystems correctly when we are in a cleanup context. If this is
practically the best way to ensure we don't get caught with our pants
down by a random stale device, then I suppose it's fine.

A total aside I just thought of:
I think it might also make sense to consider adding logic to look for
single-device filesystems with a device->bdev that is stale from the
block layer's perspective, and somehow marking those in a way that
tempfsid cares about. That would help with the case where we delete the
block device out from under a mounted fs and mount it a second time with
tempfsid after it's recreated. Not a huge deal, as we've already
discussed, though.
>
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
>  fs/btrfs/volumes.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 60d848392cd0..bb0857cfbef2 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -1382,6 +1382,8 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
>
>  	lockdep_assert_held(&uuid_mutex);
>
> +	btrfs_free_stale_devices(0, NULL, true);
> +
>  	/*
>  	 * we would like to check all the supers, but that would make
>  	 * a btrfs mount succeed after a mkfs from a different FS.
> --
> 2.38.1
On 3/14/24 22:41, Boris Burkov wrote:
> On Sat, Mar 09, 2024 at 07:14:31PM +0530, Anand Jain wrote:
>> Tempfsid assumes that all registered single devices in the fs_devices
>> list are to be mounted; otherwise, they won't be in the btrfs_device
>> list.
>>
>> We recently fixed a related bug caused by leaving a failed-open device
>> in the list. This wrongly triggered tempfsid activation upon
>> subsequent mounts of the same fsid.
>>
>> To prevent this, scan the entire device list at mount for any stray
>> devices and free them in btrfs_scan_one_device().
>
> Is this an additional precaution on top of maintaining an invariant on
> every umount/failed mount that we have freed stale devices of
> single-device filesystems? Or is it fundamentally impossible for us to
> enforce that invariant?
>

Hmm. That's the ultimate goal: maintaining such an invariant. However,
there are bugs, so this is the place where we can detect whether we have
been successful. If we haven't, we can work around it by freeing the
stale device and avoiding the bad consequences.

I think I should also include a warning message when we detect and free
a stray device, so that it can be reviewed for a proper fix. Does that
seem reasonable?

> It feels like overkill to hack up free_stale_devices in this way,
> compared to just ensuring that we manage cleaning up single-device
> filesystems correctly when we are in a cleanup context. If this is
> practically the best way to ensure we don't get caught with our pants
> down by a random stale device, then I suppose it's fine.
>
> A total aside I just thought of:
> I think it might also make sense to consider adding logic to look for
> single-device filesystems with a device->bdev that is stale from the
> block layer's perspective, and somehow marking those in a way that
> tempfsid cares about.

How would we know if the block layer considers a certain device's block
device (device->bdev) stale?

If you mean a write IO failure, we already put the filesystem in
read-only mode when that happens. But we can't close the device due to
the pending write. (Some operating systems have an option to call panic,
which dumps memory to the coredump device and reboots.)

> That would help with the case where we delete the block dev out from
> under a mounted fs and mount it a second time with tempfsid after it's
> recreated. Not a huge deal, as we've already discussed, though.

Yeah, thanks for brainstorming. Basically, we need a way to distinguish
between the same physical device with multiple device nodes and
different physical devices with the same filesystem.

Thanks, Anand

>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>>  fs/btrfs/volumes.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index 60d848392cd0..bb0857cfbef2 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -1382,6 +1382,8 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
>>
>>  	lockdep_assert_held(&uuid_mutex);
>>
>> +	btrfs_free_stale_devices(0, NULL, true);
>> +
>>  	/*
>>  	 * we would like to check all the supers, but that would make
>>  	 * a btrfs mount succeed after a mkfs from a different FS.
>> --
>> 2.38.1
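The detect-and-warn behavior discussed above can be sketched as a small
user-space model. This is not the kernel code: `model_device`, its
fields, and `free_stale_devices` here are simplified stand-ins for the
real `btrfs_device`/`btrfs_fs_devices` structures, written only to show
the idea of walking the registered list, freeing entries that are
neither opened nor mounted, and warning so the underlying cleanup bug
can still be tracked down.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Simplified stand-in for a registered device entry. In the kernel the
 * equivalent state lives in btrfs_device/btrfs_fs_devices; the names and
 * fields here are illustrative only.
 */
struct model_device {
	char path[64];
	int opened;              /* device is currently held open */
	int mounted;             /* a mounted filesystem is using it */
	struct model_device *next;
};

/*
 * Walk the list and free every stray entry (neither opened nor mounted),
 * warning as we go so the missed cleanup can be reviewed for a proper fix.
 * Returns the number of entries freed.
 */
static int free_stale_devices(struct model_device **head)
{
	struct model_device **pp = head;
	int freed = 0;

	while (*pp) {
		struct model_device *dev = *pp;

		if (!dev->opened && !dev->mounted) {
			fprintf(stderr,
				"warning: freeing stray device %s left on the list\n",
				dev->path);
			*pp = dev->next;   /* unlink before freeing */
			free(dev);
			freed++;
		} else {
			pp = &dev->next;
		}
	}
	return freed;
}
```

Calling this at the top of every scan, as the patch does with
`btrfs_free_stale_devices()`, makes the scan self-correcting: even if an
earlier failed mount forgot to unregister a device, the next scan starts
from a clean list.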
Boris, gentle ping? Thanks.

On 3/16/24 19:43, Anand Jain wrote:
> On 3/14/24 22:41, Boris Burkov wrote:
>> On Sat, Mar 09, 2024 at 07:14:31PM +0530, Anand Jain wrote:
>>> Tempfsid assumes that all registered single devices in the fs_devices
>>> list are to be mounted; otherwise, they won't be in the btrfs_device
>>> list.
>>>
>>> We recently fixed a related bug caused by leaving a failed-open
>>> device in the list. This wrongly triggered tempfsid activation upon
>>> subsequent mounts of the same fsid.
>>>
>>> To prevent this, scan the entire device list at mount for any stray
>>> devices and free them in btrfs_scan_one_device().
>>
>> Is this an additional precaution on top of maintaining an invariant on
>> every umount/failed mount that we have freed stale devices of
>> single-device filesystems? Or is it fundamentally impossible for us to
>> enforce that invariant?
>>
>
> Hmm. That's the ultimate goal: maintaining such an invariant. However,
> there are bugs, so this is the place where we can detect whether we
> have been successful. If we haven't, we can work around it by freeing
> the stale device and avoiding the bad consequences.
>
> I think I should also include a warning message when we detect and
> free a stray device, so that it can be reviewed for a proper fix.
> Does that seem reasonable?
>
>> It feels like overkill to hack up free_stale_devices in this way,
>> compared to just ensuring that we manage cleaning up single-device
>> filesystems correctly when we are in a cleanup context. If this is
>> practically the best way to ensure we don't get caught with our pants
>> down by a random stale device, then I suppose it's fine.
>>
>> A total aside I just thought of:
>> I think it might also make sense to consider adding logic to look for
>> single-device filesystems with a device->bdev that is stale from the
>> block layer's perspective, and somehow marking those in a way that
>> tempfsid cares about.
>
> How would we know if the block layer considers a certain device's
> block device (device->bdev) stale?
>
> If you mean a write IO failure, we already put the filesystem in
> read-only mode when that happens. But we can't close the device due
> to the pending write. (Some operating systems have an option to call
> panic, which dumps memory to the coredump device and reboots.)
>
>> That would help with the case where we delete the block dev out from
>> under a mounted fs and mount it a second time with tempfsid after
>> it's recreated. Not a huge deal, as we've already discussed, though.
>
> Yeah, thanks for brainstorming. Basically, we need a way to
> distinguish between the same physical device with multiple device
> nodes and different physical devices with the same filesystem.
>
> Thanks, Anand
>
>>
>>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>>> ---
>>>  fs/btrfs/volumes.c | 2 ++
>>>  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>> index 60d848392cd0..bb0857cfbef2 100644
>>> --- a/fs/btrfs/volumes.c
>>> +++ b/fs/btrfs/volumes.c
>>> @@ -1382,6 +1382,8 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
>>>
>>>  	lockdep_assert_held(&uuid_mutex);
>>>
>>> +	btrfs_free_stale_devices(0, NULL, true);
>>> +
>>>  	/*
>>>  	 * we would like to check all the supers, but that would make
>>>  	 * a btrfs mount succeed after a mkfs from a different FS.
>>> --
>>> 2.38.1
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 60d848392cd0..bb0857cfbef2 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1382,6 +1382,8 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
 
 	lockdep_assert_held(&uuid_mutex);
 
+	btrfs_free_stale_devices(0, NULL, true);
+
 	/*
 	 * we would like to check all the supers, but that would make
 	 * a btrfs mount succeed after a mkfs from a different FS.
Tempfsid assumes that all registered single devices in the fs_devices
list are to be mounted; otherwise, they won't be in the btrfs_device
list.

We recently fixed a related bug caused by leaving a failed-open device
in the list. This wrongly triggered tempfsid activation upon subsequent
mounts of the same fsid.

To prevent this, scan the entire device list at mount for any stray
devices and free them in btrfs_scan_one_device().

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 fs/btrfs/volumes.c | 2 ++
 1 file changed, 2 insertions(+)
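The failure mode the commit message describes can be illustrated with a
small user-space model: tempfsid kicks in when a mount finds another
registered entry with the same fsid, so a stray failed-open entry makes
a perfectly normal remount of the same device look like a duplicate
filesystem. The names below (`model_fs`, `needs_tempfsid`,
`purge_stale`) are hypothetical stand-ins for illustration, not the
kernel's actual structures or helpers.

```c
#include <assert.h>
#include <string.h>

#define MAX_REGISTERED 8

/* Stand-in for a registered single-device filesystem entry. */
struct model_fs {
	char fsid[37];   /* textual UUID, for simplicity */
	int stale;       /* left behind by a failed open */
};

static struct model_fs registered[MAX_REGISTERED];
static int nr_registered;

/*
 * Model of the tempfsid trigger: a new mount of `fsid` is treated as a
 * duplicate (and gets a temporary fsid) if the same fsid is already on
 * the registered list.
 */
static int needs_tempfsid(const char *fsid)
{
	for (int i = 0; i < nr_registered; i++)
		if (strcmp(registered[i].fsid, fsid) == 0)
			return 1;
	return 0;
}

/*
 * Model of the fix: drop stray entries before the duplicate check, so a
 * leftover failed-open entry cannot masquerade as a second filesystem.
 */
static void purge_stale(void)
{
	int j = 0;

	for (int i = 0; i < nr_registered; i++)
		if (!registered[i].stale)
			registered[j++] = registered[i];
	nr_registered = j;
}
```

With a stale entry present, `needs_tempfsid()` wrongly reports a
duplicate for an ordinary remount; purging the list first, as the patch
does with `btrfs_free_stale_devices()` at the top of
`btrfs_scan_one_device()`, restores the expected behavior.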