Message ID | 752b8526be21d984e0ee58c7f66d312664ff5ac5.1709256891.git.anand.jain@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: validate device maj:min during open | expand |
On Fri, Mar 01, 2024 at 07:21:32AM +0530, Anand Jain wrote: > Boris managed to create a device capable of changing its maj:min without > altering its device path. > > Only multi-devices can be scanned. A device that gets scanned and remains > in the Btrfs kernel cache might end up with an incorrect maj:min. > > Despite the tempfsid feature patch did not introduce this bug, it could > lead to issues if the above multi-device is converted to a single device > with a stale maj:min. Subsequently, attempting to mount the same device > with the correct maj:min might mistake it for another device with the same > fsid, potentially resulting in wrongly auto-enabling the tempfsid feature. > > To address this, this patch validates the device's maj:min at the time of > device open and updates it if it has changed since the last scan. I do believe this patch is correct for fixing my test case, but I don't believe that it is the proper fix for this issue. This is just plugging one more hole in a leaky dam. > > Fixes: a5b8a5f9f835 ("btrfs: support cloned-device mount capability") > Reported-by: Boris Burkov <boris@bur.io> > Co-developed-by: Boris Burkov <boris@bur.io> > Signed-off-by: Anand Jain <anand.jain@oracle.com> > --- > fs/btrfs/volumes.c | 19 +++++++++++++++++++ > 1 file changed, 19 insertions(+) > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > index deb4f191730d..4c498f088302 100644 > --- a/fs/btrfs/volumes.c > +++ b/fs/btrfs/volumes.c > @@ -644,6 +644,7 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices, > struct bdev_handle *bdev_handle; > struct btrfs_super_block *disk_super; > u64 devid; > + dev_t devt; > int ret; > > if (device->bdev) > @@ -692,6 +693,24 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices, > device->bdev = bdev_handle->bdev; > clear_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &device->dev_state); > > + ret = lookup_bdev(device->name->str, &devt); It should be fine to just use the dev_t in bdev_handle->bdev->bd_dev. > + if (ret) { > + btrfs_err(NULL, > + "failed to validate (%d:%d) maj:min for device %s %d resetting to 0", > + MAJOR(device->devt), MINOR(device->devt), > + device->name->str, ret); > + device->devt = 0; > + } else { > + if (device->devt != devt) { > + btrfs_warn(NULL, > + "device %s maj:min changed from %d:%d to %d:%d", > + device->name->str, MAJOR(device->devt), > + MINOR(device->devt), MAJOR(devt), > + MINOR(devt)); > + device->devt = devt; > + } > + } > + > fs_devices->open_devices++; > if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state) && > device->devid != BTRFS_DEV_REPLACE_DEVID) { > -- > 2.38.1 >
On 3/1/24 21:04, Boris Burkov wrote: > On Fri, Mar 01, 2024 at 07:21:32AM +0530, Anand Jain wrote: >> Boris managed to create a device capable of changing its maj:min without >> altering its device path. >> >> Only multi-devices can be scanned. A device that gets scanned and remains >> in the Btrfs kernel cache might end up with an incorrect maj:min. >> >> Despite the tempfsid feature patch did not introduce this bug, it could >> lead to issues if the above multi-device is converted to a single device >> with a stale maj:min. Subsequently, attempting to mount the same device >> with the correct maj:min might mistake it for another device with the same >> fsid, potentially resulting in wrongly auto-enabling the tempfsid feature. >> >> To address this, this patch validates the device's maj:min at the time of >> device open and updates it if it has changed since the last scan. > > I do believe this patch is correct for fixing my test case, Right. It fixes the bug reported in the testcase only. > but I don't > believe that it is the proper fix for this issue. Indeed, I anticipated that it might be confusing. As I've clarified elsewhere, this isn't the solution for the already known multi-nodes single-device issue. The resolution is currently being discussed. I think I've come up with a better idea now. I'll send out the proposal soon for discussions. However, my first challenge is to simulate a multi-nodes single-device environment for testing. Thx, Anand > This is just plugging > one more hole in a leaky dam. >> Fixes: a5b8a5f9f835 ("btrfs: support cloned-device mount capability") >> Reported-by: Boris Burkov <boris@bur.io> >> Co-developed-by: Boris Burkov <boris@bur.io> >> Signed-off-by: Anand Jain <anand.jain@oracle.com> >> --- >> fs/btrfs/volumes.c | 19 +++++++++++++++++++ >> 1 file changed, 19 insertions(+) >> >> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c >> index deb4f191730d..4c498f088302 100644 >> --- a/fs/btrfs/volumes.c >> +++ b/fs/btrfs/volumes.c >> @@ -644,6 +644,7 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices, >> struct bdev_handle *bdev_handle; >> struct btrfs_super_block *disk_super; >> u64 devid; >> + dev_t devt; >> int ret; >> >> if (device->bdev) >> @@ -692,6 +693,24 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices, >> device->bdev = bdev_handle->bdev; >> clear_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &device->dev_state); >> >> + ret = lookup_bdev(device->name->str, &devt); > > It should be fine to just use the dev_t in bdev_handle->bdev->bd_dev. > >> + if (ret) { >> + btrfs_err(NULL, >> + "failed to validate (%d:%d) maj:min for device %s %d resetting to 0", >> + MAJOR(device->devt), MINOR(device->devt), >> + device->name->str, ret); >> + device->devt = 0; >> + } else { >> + if (device->devt != devt) { >> + btrfs_warn(NULL, >> + "device %s maj:min changed from %d:%d to %d:%d", >> + device->name->str, MAJOR(device->devt), >> + MINOR(device->devt), MAJOR(devt), >> + MINOR(devt)); >> + device->devt = devt; >> + } >> + } >> + >> fs_devices->open_devices++; >> if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state) && >> device->devid != BTRFS_DEV_REPLACE_DEVID) { >> -- >> 2.38.1 >>
I missed a comment. >> @@ -692,6 +693,24 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices, >> device->bdev = bdev_handle->bdev; >> clear_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &device->dev_state); >> >> + ret = lookup_bdev(device->name->str, &devt); > > It should be fine to just use the dev_t in bdev_handle->bdev->bd_dev. Oh. That's much improved. I'll fix. Thanks, -Anand
On Fri, Mar 01, 2024 at 10:17:41PM +0530, Anand Jain wrote: > > > On 3/1/24 21:04, Boris Burkov wrote: > > On Fri, Mar 01, 2024 at 07:21:32AM +0530, Anand Jain wrote: > > > Boris managed to create a device capable of changing its maj:min without > > > altering its device path. > > > > > > Only multi-devices can be scanned. A device that gets scanned and remains > > > in the Btrfs kernel cache might end up with an incorrect maj:min. > > > > > > Despite the tempfsid feature patch did not introduce this bug, it could > > > lead to issues if the above multi-device is converted to a single device > > > with a stale maj:min. Subsequently, attempting to mount the same device > > > with the correct maj:min might mistake it for another device with the same > > > fsid, potentially resulting in wrongly auto-enabling the tempfsid feature. > > > > > > To address this, this patch validates the device's maj:min at the time of > > > device open and updates it if it has changed since the last scan. > > > > > I do believe this patch is correct for fixing my test case, > > Right. It fixes the bug reported in the testcase only. > > > but I don't > > believe that it is the proper fix for this issue. > > Indeed, I anticipated that it might be confusing. As I've clarified > elsewhere, this isn't the solution for the already known multi-nodes > single-device issue. The resolution is currently being discussed. I'm not familiar with the the multi-nodes single-device issue. My point was that while I agree that this fix works for the issue I do know about, an invalid devt in the btrfs device cache, I do not like it as a matter of taste. We have an invalid cache and instead of invalidating it in a general way at the point of invalidation, you are proposing that we just fix it up when we happen to notice. I believe that is a fragile solution that will simply leave the door open for more bugs in the future. (if not present). If we do go this way, at the very least I think we should do something a bit more general. Spitballing here, but perhaps we can get rid of all redundant caching of random values like devt in btrfs_device and just use the contents of the bdev struct as it is currently in the block layer. Or we need a way to treat a btrfs_device as invalid unless open_one_device has run on it. > > I think I've come up with a better idea now. I'll send out the > proposal soon for discussions. > > However, my first challenge is to simulate a multi-nodes > single-device environment for testing. > > Thx, Anand > > > This is just plugging > > one more hole in a leaky dam. > > > > Fixes: a5b8a5f9f835 ("btrfs: support cloned-device mount capability") > > > Reported-by: Boris Burkov <boris@bur.io> > > > Co-developed-by: Boris Burkov <boris@bur.io> > > > Signed-off-by: Anand Jain <anand.jain@oracle.com> > > > --- > > > fs/btrfs/volumes.c | 19 +++++++++++++++++++ > > > 1 file changed, 19 insertions(+) > > > > > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > > > index deb4f191730d..4c498f088302 100644 > > > --- a/fs/btrfs/volumes.c > > > +++ b/fs/btrfs/volumes.c > > > @@ -644,6 +644,7 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices, > > > struct bdev_handle *bdev_handle; > > > struct btrfs_super_block *disk_super; > > > u64 devid; > > > + dev_t devt; > > > int ret; > > > if (device->bdev) > > > @@ -692,6 +693,24 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices, > > > device->bdev = bdev_handle->bdev; > > > clear_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &device->dev_state); > > > + ret = lookup_bdev(device->name->str, &devt); > > > > It should be fine to just use the dev_t in bdev_handle->bdev->bd_dev. > > > > > + if (ret) { > > > + btrfs_err(NULL, > > > + "failed to validate (%d:%d) maj:min for device %s %d resetting to 0", > > > + MAJOR(device->devt), MINOR(device->devt), > > > + device->name->str, ret); > > > + device->devt = 0; > > > + } else { > > > + if (device->devt != devt) { > > > + btrfs_warn(NULL, > > > + "device %s maj:min changed from %d:%d to %d:%d", > > > + device->name->str, MAJOR(device->devt), > > > + MINOR(device->devt), MAJOR(devt), > > > + MINOR(devt)); > > > + device->devt = devt; > > > + } > > > + } > > > + > > > fs_devices->open_devices++; > > > if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state) && > > > device->devid != BTRFS_DEV_REPLACE_DEVID) { > > > -- > > > 2.38.1 > > >
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index deb4f191730d..4c498f088302 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -644,6 +644,7 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices, struct bdev_handle *bdev_handle; struct btrfs_super_block *disk_super; u64 devid; + dev_t devt; int ret; if (device->bdev) @@ -692,6 +693,24 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices, device->bdev = bdev_handle->bdev; clear_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &device->dev_state); + ret = lookup_bdev(device->name->str, &devt); + if (ret) { + btrfs_err(NULL, + "failed to validate (%d:%d) maj:min for device %s %d resetting to 0", + MAJOR(device->devt), MINOR(device->devt), + device->name->str, ret); + device->devt = 0; + } else { + if (device->devt != devt) { + btrfs_warn(NULL, + "device %s maj:min changed from %d:%d to %d:%d", + device->name->str, MAJOR(device->devt), + MINOR(device->devt), MAJOR(devt), + MINOR(devt)); + device->devt = devt; + } + } + fs_devices->open_devices++; if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state) && device->devid != BTRFS_DEV_REPLACE_DEVID) {
Boris managed to create a device capable of changing its maj:min without altering its device path. Only multi-devices can be scanned. A device that gets scanned and remains in the Btrfs kernel cache might end up with an incorrect maj:min. Despite the tempfsid feature patch did not introduce this bug, it could lead to issues if the above multi-device is converted to a single device with a stale maj:min. Subsequently, attempting to mount the same device with the correct maj:min might mistake it for another device with the same fsid, potentially resulting in wrongly auto-enabling the tempfsid feature. To address this, this patch validates the device's maj:min at the time of device open and updates it if it has changed since the last scan. Fixes: a5b8a5f9f835 ("btrfs: support cloned-device mount capability") Reported-by: Boris Burkov <boris@bur.io> Co-developed-by: Boris Burkov <boris@bur.io> Signed-off-by: Anand Jain <anand.jain@oracle.com> --- fs/btrfs/volumes.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+)