diff mbox

[V3] Btrfs: device_list_add() should not update list when mounted

Message ID 1402025201-17140-1-git-send-email-anand.jain@oracle.com (mailing list archive)
State Superseded
Headers show

Commit Message

Anand Jain June 6, 2014, 3:26 a.m. UTC
(looks like there was some sendmail problem I don't see this in the btrfs list,
sending again. sorry for multiple copies if any).

device_list_add() is called when user runs btrfs dev scan, which would add
any btrfs device into the btrfs_fs_devices list.

Now think of a mounted btrfs. And a new device which contains the a SB
from the mounted btrfs devices.

In this situation when user runs btrfs dev scan, the current code would
just replace existing device with the new device.

Which is to note that old device is neither closed nor gracefully
removed from the btrfs.

The FS is still operational with the old bdev however the device name
is the btrfs_device is new which is provided by the btrfs dev scan.

reproducer:

devmgt[1] detach /dev/sdc

replace the missing disk /dev/sdc

btrfs rep start -f 1 /dev/sde /btrfs
Label: none  uuid: 5dc0aaf4-4683-4050-b2d6-5ebe5f5cd120
        Total devices 2 FS bytes used 32.00KiB
        devid    1 size 958.94MiB used 115.88MiB path /dev/sde
        devid    2 size 958.94MiB used 103.88MiB path /dev/sdd

make /dev/sdc to reappear

devmgt attach host2

btrfs dev scan

btrfs fi show -m
Label: none  uuid: 5dc0aaf4-4683-4050-b2d6-5ebe5f5cd120^M
        Total devices 2 FS bytes used 32.00KiB^M
        devid    1 size 958.94MiB used 115.88MiB path /dev/sdc <- Wrong.
        devid    2 size 958.94MiB used 103.88MiB path /dev/sdd

since /dev/sdc has been replaced with /dev/sde, the /dev/sdc shouldn't be
part of the btrfs-fsid when it reappears. If user want it to be part of it
then sys admin should be using btrfs device add instead.

[1] github.com/anajain/devmgt.git

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
v2->v3: simpler commit and comment message
v1->v2: remove '---' in commit messages which conflict with git am

 fs/btrfs/volumes.c | 33 ++++++++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)

Comments

Qu Wenruo June 9, 2014, 1:52 a.m. UTC | #1
-------- Original Message --------
Subject: [PATCH V3] Btrfs: device_list_add() should not update list when 
mounted
From: Anand Jain <anand.jain@oracle.com>
To: linux-btrfs@vger.kernel.org
Date: 2014?06?06? 11:26
> (looks like there was some sendmail problem I don't see this in the btrfs list,
> sending again. sorry for multiple copies if any).
>
> device_list_add() is called when user runs btrfs dev scan, which would add
> any btrfs device into the btrfs_fs_devices list.
>
> Now think of a mounted btrfs. And a new device which contains the a SB
> from the mounted btrfs devices.
>
> In this situation when user runs btrfs dev scan, the current code would
> just replace existing device with the new device.
>
> Which is to note that old device is neither closed nor gracefully
> removed from the btrfs.
>
> The FS is still operational with the old bdev however the device name
> is the btrfs_device is new which is provided by the btrfs dev scan.
>
> reproducer:
>
> devmgt[1] detach /dev/sdc
>
> replace the missing disk /dev/sdc
>
> btrfs rep start -f 1 /dev/sde /btrfs
> Label: none  uuid: 5dc0aaf4-4683-4050-b2d6-5ebe5f5cd120
>          Total devices 2 FS bytes used 32.00KiB
>          devid    1 size 958.94MiB used 115.88MiB path /dev/sde
>          devid    2 size 958.94MiB used 103.88MiB path /dev/sdd
>
> make /dev/sdc to reappear
>
> devmgt attach host2
>
> btrfs dev scan
>
> btrfs fi show -m
> Label: none  uuid: 5dc0aaf4-4683-4050-b2d6-5ebe5f5cd120^M
>          Total devices 2 FS bytes used 32.00KiB^M
>          devid    1 size 958.94MiB used 115.88MiB path /dev/sdc <- Wrong.
>          devid    2 size 958.94MiB used 103.88MiB path /dev/sdd
>
> since /dev/sdc has been replaced with /dev/sde, the /dev/sdc shouldn't be
> part of the btrfs-fsid when it reappears. If user want it to be part of it
> then sys admin should be using btrfs device add instead.
>
> [1] github.com/anajain/devmgt.git
>
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
Thanks for your patch.

The patch works fine, but unfortunately the solution seems not so generic.

The patch just avoid the dirty work to distinguish devices with same uuid,
This will fix the bug with dev scan, but when you umount the fs and 
mount it again,
it will use the old device again, since device_list_add() still can't 
distinguish different
devices with same dev uuid.

IMO *current* root fix should add the ablity to distinguish different 
deivces even they share
same uuid.
And further more, *long term* fix should be not reusing dev uuid and use 
above fix(ablity to distinguish)
as a fallback.

About the method, I still think Wang's suggestion, using generation to 
distinguish, is good enough for this
bug.
It would be quite kind for you to provide any other ideas.

Thanks,
Qu

> ---
> v2->v3: simpler commit and comment message
> v1->v2: remove '---' in commit messages which conflict with git am
>
>   fs/btrfs/volumes.c | 33 ++++++++++++++++++++++++++++++++-
>   1 file changed, 32 insertions(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index bb2cd66..605d9ef 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -472,9 +472,12 @@ static noinline int device_list_add(const char *path,
>   		device = __find_device(&fs_devices->devices, devid,
>   				       disk_super->dev_item.uuid);
>   	}
> +
>   	if (!device) {
> -		if (fs_devices->opened)
> +		if (fs_devices->opened) {
> +			printk(KERN_INFO "BTRFS: device list add failed, FS is open");
>   			return -EBUSY;
> +		}
>   
>   		device = btrfs_alloc_device(NULL, &devid,
>   					    disk_super->dev_item.uuid);
> @@ -497,6 +500,34 @@ static noinline int device_list_add(const char *path,
>   
>   		device->fs_devices = fs_devices;
>   	} else if (!device->name || strcmp(device->name->str, path)) {
> +		/*
> +		 * When FS is already mounted.
> +		 * 1. If you are here and if the device->name is NULL that means
> +		 *    this device was missing at time of FS mount.
> +		 * 2. If you are here and if the device->name is different from 'path'
> +		 *    that means either
> +		 *      a. The same device disappeared and reappeared with different
> +		 *         name. or
> +		 *      b. The missing-disk-which-was-replaced, has reappeared now.
> +		 *
> +		 *  We must allow 1 and 2a above. But 2b would be a spurious and
> +		 *  unintentional.
> +		 *  Further in case of 1 and 2a above, the disk at 'path' would have
> +		 *  missed some transaction when it was away and in case of 2a
> +		 *  the stale bdev has to be updated as well.
> +		 *  2b must not be allowed at all time.
> +		 */
> +
> +		/*
> +		 * As of now don't allow update to btrfs_fs_device through the
> +		 * btrfs dev scan cli, after FS has been mounted.
> +		 */
> +
> +		if (fs_devices->opened) {
> +			printk(KERN_INFO "BTRFS: device list update failed, FS is open");
> +			return -EBUSY;
> +		}
> +
>   		name = rcu_string_strdup(path, GFP_NOFS);
>   		if (!name)
>   			return -ENOMEM;

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Anand Jain June 9, 2014, 2:19 a.m. UTC | #2
Hi Qu,

> Thanks for your patch.
>
> The patch works fine,



> but unfortunately the solution seems not so generic.
> The patch just avoid the dirty work to distinguish devices with same uuid,
> This will fix the bug with dev scan, but when you umount the fs and
> mount it again, it will use the old device again, since device_list_add()
 > still can't distinguish different devices with same dev uuid.

  As mentioned in the other thread.. we expect user to check the devices
  before / after mount and wipefs the disks which should not belong to
  the fsid. as below.

----------------
  Yes. some challenges to get that based on the generation number.
  too many limitations. and patch created didn't pass all the tests.
  so I didn't send that patch.
  But I was talking about this patch (sorry to confuse you).

    Btrfs: device_list_add() should not update list when mounted

  And as of now when its unmounted we expect user to wipe SB
  of the disk which should not belong to the fsid. which will
  solve the problem as well. but a bit of hard work though.
  (there is a chance to notice the _actual_ disks being used
  after the fs is mounted)
----------------

  Above patch will avoid the serious problem that happens
  when disk reappears (on a SElinux)

  when disk reappears on SElinux the device_list_add() is called
  automatically.  And that will replace the disk of a mounted FS.
  This patch will fix that.





> IMO *current* root fix should add the ablity to distinguish different
> deivces even they share
> same uuid.
> And further more, *long term* fix should be not reusing dev uuid and use
> above fix(ablity to distinguish)
> as a fallback.
>
> About the method, I still think Wang's suggestion, using generation to
> distinguish, is good enough for this
> bug.
> It would be quite kind for you to provide any other ideas.

  generation check does not solve the following test case.
     (missing devices are very common incidence at the data centers).
     - replace a missing disk (replace-source) with replace-target
     - unmount
     - now replace-target disappears
     - replace-source which was missing now reappears
     - mount (mounts with replace-source since replace-target is not
       present and so now the generation is greatest on the
       replace-source)
     - unmount
     - now all disks reappears
     - mount would pick replace-source instead of replace-target


  As you mention replace should not clone uuid is the best approach
  for all these issues. I had the same idea when I first notice this
  issue.

  But as of now the above workaround patch is simple and safe.


  Let me know. your thoughts.

Thanks , Anand


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Qu Wenruo June 9, 2014, 2:35 a.m. UTC | #3
-------- Original Message --------
Subject: Re: [PATCH V3] Btrfs: device_list_add() should not update list 
when mounted
From: Anand Jain <Anand.Jain@oracle.com>
To: Qu Wenruo <quwenruo@cn.fujitsu.com>, linux-btrfs@vger.kernel.org
Date: 2014?06?09? 10:19
>
> Hi Qu,
>
>> Thanks for your patch.
>>
>> The patch works fine,
>
>
>
>> but unfortunately the solution seems not so generic.
>> The patch just avoid the dirty work to distinguish devices with same 
>> uuid,
>> This will fix the bug with dev scan, but when you umount the fs and
>> mount it again, it will use the old device again, since 
>> device_list_add()
> > still can't distinguish different devices with same dev uuid.
>
>  As mentioned in the other thread.. we expect user to check the devices
>  before / after mount and wipefs the disks which should not belong to
>  the fsid. as below.
>
> ----------------
>  Yes. some challenges to get that based on the generation number.
>  too many limitations. and patch created didn't pass all the tests.
>  so I didn't send that patch.
>  But I was talking about this patch (sorry to confuse you).
>
>    Btrfs: device_list_add() should not update list when mounted
>
>  And as of now when its unmounted we expect user to wipe SB
>  of the disk which should not belong to the fsid. which will
>  solve the problem as well. but a bit of hard work though.
>  (there is a chance to notice the _actual_ disks being used
>  after the fs is mounted)
> ----------------
>
>  Above patch will avoid the serious problem that happens
>  when disk reappears (on a SElinux)
>
>  when disk reappears on SElinux the device_list_add() is called
>  automatically.  And that will replace the disk of a mounted FS.
>  This patch will fix that.
>
>
>
>
>
>> IMO *current* root fix should add the ablity to distinguish different
>> deivces even they share
>> same uuid.
>> And further more, *long term* fix should be not reusing dev uuid and use
>> above fix(ablity to distinguish)
>> as a fallback.
>>
>> About the method, I still think Wang's suggestion, using generation to
>> distinguish, is good enough for this
>> bug.
>> It would be quite kind for you to provide any other ideas.
>
>  generation check does not solve the following test case.
>     (missing devices are very common incidence at the data centers).
>     - replace a missing disk (replace-source) with replace-target
>     - unmount
>     - now replace-target disappears
>     - replace-source which was missing now reappears
>     - mount (mounts with replace-source since replace-target is not
>       present and so now the generation is greatest on the
>       replace-source)
>     - unmount
>     - now all disks reappears
>     - mount would pick replace-source instead of replace-target
>
>
>  As you mention replace should not clone uuid is the best approach
>  for all these issues. I had the same idea when I first notice this
>  issue.
>
>  But as of now the above workaround patch is simple and safe.
>
>
>  Let me know. your thoughts.
>
> Thanks , Anand
>
>
Thanks for your reply.

Did you tried the following test case?
- mount with degraded mode (missing device is called A)
- replace device A with device B.
- umount fs
- device A reappeared and reboot the system
- mount fs again.

I think the newly mounted fs will still use device A not device B with 
your patch,
even though device B has a larger generation.

For the case you mentioned, I think the behavior is OK,
always use the device with *current* largest generation is a acceptable
strategy, since the only things we can depend on the devices we 
*current* see...

P.S. Even btrfs changes to not duplicate dev uuid, for backward 
compatibility,
we still need to face the problem...

Thanks,
Qu.


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Anand Jain June 9, 2014, 3:29 a.m. UTC | #4
>>  As mentioned in the other thread.. we expect user to check the devices
>>  before / after mount and wipefs the disks which should not belong to
>>  the fsid.

  my bad. I just realized that unmount and wipefs may not be an
  viable choice in case of btrfs as root fs.

> For the case you mentioned, I think the behavior is OK,
> always use the device with *current* largest generation is a acceptable
> strategy, since the only things we can depend on the devices we
> *current* see...

  Yeah. let me do that approach as well to take care of picking
  the correct disk when FS is _unmounted_.

  And, when FS is mounted we should NOT kick out device paths
  invariably. which means we still need the patch.

    Btrfs: device_list_add() should not update list when mounted

Thanks, Anand


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Qu Wenruo June 10, 2014, 1:25 a.m. UTC | #5
-------- Original Message --------
Subject: [PATCH V3] Btrfs: device_list_add() should not update list when 
mounted
From: Anand Jain <anand.jain@oracle.com>
To: linux-btrfs@vger.kernel.org
Date: 2014?06?06? 11:26
> (looks like there was some sendmail problem I don't see this in the btrfs list,
> sending again. sorry for multiple copies if any).
>
> device_list_add() is called when user runs btrfs dev scan, which would add
> any btrfs device into the btrfs_fs_devices list.
>
> Now think of a mounted btrfs. And a new device which contains the a SB
> from the mounted btrfs devices.
>
> In this situation when user runs btrfs dev scan, the current code would
> just replace existing device with the new device.
>
> Which is to note that old device is neither closed nor gracefully
> removed from the btrfs.
>
> The FS is still operational with the old bdev however the device name
> is the btrfs_device is new which is provided by the btrfs dev scan.
>
> reproducer:
>
> devmgt[1] detach /dev/sdc
>
> replace the missing disk /dev/sdc
>
> btrfs rep start -f 1 /dev/sde /btrfs
> Label: none  uuid: 5dc0aaf4-4683-4050-b2d6-5ebe5f5cd120
>          Total devices 2 FS bytes used 32.00KiB
>          devid    1 size 958.94MiB used 115.88MiB path /dev/sde
>          devid    2 size 958.94MiB used 103.88MiB path /dev/sdd
>
> make /dev/sdc to reappear
>
> devmgt attach host2
>
> btrfs dev scan
>
> btrfs fi show -m
> Label: none  uuid: 5dc0aaf4-4683-4050-b2d6-5ebe5f5cd120^M
>          Total devices 2 FS bytes used 32.00KiB^M
>          devid    1 size 958.94MiB used 115.88MiB path /dev/sdc <- Wrong.
>          devid    2 size 958.94MiB used 103.88MiB path /dev/sdd
>
> since /dev/sdc has been replaced with /dev/sde, the /dev/sdc shouldn't be
> part of the btrfs-fsid when it reappears. If user want it to be part of it
> then sys admin should be using btrfs device add instead.
>
> [1] github.com/anajain/devmgt.git
>
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
> v2->v3: simpler commit and comment message
> v1->v2: remove '---' in commit messages which conflict with git am
>
>   fs/btrfs/volumes.c | 33 ++++++++++++++++++++++++++++++++-
>   1 file changed, 32 insertions(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index bb2cd66..605d9ef 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -472,9 +472,12 @@ static noinline int device_list_add(const char *path,
>   		device = __find_device(&fs_devices->devices, devid,
>   				       disk_super->dev_item.uuid);
>   	}
> +
>   	if (!device) {
> -		if (fs_devices->opened)
> +		if (fs_devices->opened) {
> +			printk(KERN_INFO "BTRFS: device list add failed, FS is open");
>   			return -EBUSY;
> +		}
>   
>   		device = btrfs_alloc_device(NULL, &devid,
>   					    disk_super->dev_item.uuid);
> @@ -497,6 +500,34 @@ static noinline int device_list_add(const char *path,
>   
>   		device->fs_devices = fs_devices;
>   	} else if (!device->name || strcmp(device->name->str, path)) {
> +		/*
> +		 * When FS is already mounted.
> +		 * 1. If you are here and if the device->name is NULL that means
> +		 *    this device was missing at time of FS mount.
> +		 * 2. If you are here and if the device->name is different from 'path'
> +		 *    that means either
> +		 *      a. The same device disappeared and reappeared with different
> +		 *         name. or
> +		 *      b. The missing-disk-which-was-replaced, has reappeared now.
> +		 *
> +		 *  We must allow 1 and 2a above. But 2b would be a spurious and
> +		 *  unintentional.
> +		 *  Further in case of 1 and 2a above, the disk at 'path' would have
> +		 *  missed some transaction when it was away and in case of 2a
> +		 *  the stale bdev has to be updated as well.
> +		 *  2b must not be allowed at all time.
> +		 */
> +
> +		/*
> +		 * As of now don't allow update to btrfs_fs_device through the
> +		 * btrfs dev scan cli, after FS has been mounted.
> +		 */
> +
> +		if (fs_devices->opened) {
> +			printk(KERN_INFO "BTRFS: device list update failed, FS is open");
> +			return -EBUSY;
> +		}
> +
The 'if(fs_devices->opened)' block is in both branch of the 
'if(!device)' judgement,
it would be better to extract the code block out of the 'if(!device)' block.

Thanks,
Qu
>   		name = rcu_string_strdup(path, GFP_NOFS);
>   		if (!name)
>   			return -ENOMEM;

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Anand Jain June 10, 2014, 1:48 a.m. UTC | #6
On 10/06/14 09:25, Qu Wenruo wrote:
>
> -------- Original Message --------
> Subject: [PATCH V3] Btrfs: device_list_add() should not update list when
> mounted
> From: Anand Jain <anand.jain@oracle.com>
> To: linux-btrfs@vger.kernel.org
> Date: 2014?06?06? 11:26
>> (looks like there was some sendmail problem I don't see this in the
>> btrfs list,
>> sending again. sorry for multiple copies if any).
>>
>> device_list_add() is called when user runs btrfs dev scan, which would
>> add
>> any btrfs device into the btrfs_fs_devices list.
>>
>> Now think of a mounted btrfs. And a new device which contains the a SB
>> from the mounted btrfs devices.
>>
>> In this situation when user runs btrfs dev scan, the current code would
>> just replace existing device with the new device.
>>
>> Which is to note that old device is neither closed nor gracefully
>> removed from the btrfs.
>>
>> The FS is still operational with the old bdev however the device name
>> is the btrfs_device is new which is provided by the btrfs dev scan.
>>
>> reproducer:
>>
>> devmgt[1] detach /dev/sdc
>>
>> replace the missing disk /dev/sdc
>>
>> btrfs rep start -f 1 /dev/sde /btrfs
>> Label: none  uuid: 5dc0aaf4-4683-4050-b2d6-5ebe5f5cd120
>>          Total devices 2 FS bytes used 32.00KiB
>>          devid    1 size 958.94MiB used 115.88MiB path /dev/sde
>>          devid    2 size 958.94MiB used 103.88MiB path /dev/sdd
>>
>> make /dev/sdc to reappear
>>
>> devmgt attach host2
>>
>> btrfs dev scan
>>
>> btrfs fi show -m
>> Label: none  uuid: 5dc0aaf4-4683-4050-b2d6-5ebe5f5cd120^M
>>          Total devices 2 FS bytes used 32.00KiB^M
>>          devid    1 size 958.94MiB used 115.88MiB path /dev/sdc <- Wrong.
>>          devid    2 size 958.94MiB used 103.88MiB path /dev/sdd
>>
>> since /dev/sdc has been replaced with /dev/sde, the /dev/sdc shouldn't be
>> part of the btrfs-fsid when it reappears. If user want it to be part
>> of it
>> then sys admin should be using btrfs device add instead.
>>
>> [1] github.com/anajain/devmgt.git
>>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>> v2->v3: simpler commit and comment message
>> v1->v2: remove '---' in commit messages which conflict with git am
>>
>>   fs/btrfs/volumes.c | 33 ++++++++++++++++++++++++++++++++-
>>   1 file changed, 32 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index bb2cd66..605d9ef 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -472,9 +472,12 @@ static noinline int device_list_add(const char
>> *path,
>>           device = __find_device(&fs_devices->devices, devid,
>>                          disk_super->dev_item.uuid);
>>       }
>> +
>>       if (!device) {
>> -        if (fs_devices->opened)
>> +        if (fs_devices->opened) {
>> +            printk(KERN_INFO "BTRFS: device list add failed, FS is
>> open");
>>               return -EBUSY;
>> +        }
>>           device = btrfs_alloc_device(NULL, &devid,
>>                           disk_super->dev_item.uuid);
>> @@ -497,6 +500,34 @@ static noinline int device_list_add(const char
>> *path,
>>           device->fs_devices = fs_devices;
>>       } else if (!device->name || strcmp(device->name->str, path)) {
>> +        /*
>> +         * When FS is already mounted.
>> +         * 1. If you are here and if the device->name is NULL that means
>> +         *    this device was missing at time of FS mount.
>> +         * 2. If you are here and if the device->name is different
>> from 'path'
>> +         *    that means either
>> +         *      a. The same device disappeared and reappeared with
>> different
>> +         *         name. or
>> +         *      b. The missing-disk-which-was-replaced, has
>> reappeared now.
>> +         *
>> +         *  We must allow 1 and 2a above. But 2b would be a spurious and
>> +         *  unintentional.
>> +         *  Further in case of 1 and 2a above, the disk at 'path'
>> would have
>> +         *  missed some transaction when it was away and in case of 2a
>> +         *  the stale bdev has to be updated as well.
>> +         *  2b must not be allowed at all time.
>> +         */
>> +
>> +        /*
>> +         * As of now don't allow update to btrfs_fs_device through the
>> +         * btrfs dev scan cli, after FS has been mounted.
>> +         */
>> +
>> +        if (fs_devices->opened) {
>> +            printk(KERN_INFO "BTRFS: device list update failed, FS is
>> open");
>> +            return -EBUSY;
>> +        }
>> +
> The 'if(fs_devices->opened)' block is in both branch of the
> 'if(!device)' judgement,
> it would be better to extract the code block out of the 'if(!device)'
> block.

  thanks for the comments Qu.

   we have else if. that is when device is found and path is NOT new
   (matches with the old) then we shall return success.

Anand

> Thanks,
> Qu
>>           name = rcu_string_strdup(path, GFP_NOFS);
>>           if (!name)
>>               return -ENOMEM;
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Qu Wenruo June 10, 2014, 1:50 a.m. UTC | #7
-------- Original Message --------
Subject: Re: [PATCH V3] Btrfs: device_list_add() should not update list 
when mounted
From: Anand Jain <Anand.Jain@oracle.com>
To: Qu Wenruo <quwenruo@cn.fujitsu.com>, linux-btrfs@vger.kernel.org
Date: 2014?06?10? 09:48
>
>
> On 10/06/14 09:25, Qu Wenruo wrote:
>>
>> -------- Original Message --------
>> Subject: [PATCH V3] Btrfs: device_list_add() should not update list when
>> mounted
>> From: Anand Jain <anand.jain@oracle.com>
>> To: linux-btrfs@vger.kernel.org
>> Date: 2014?06?06? 11:26
>>> (looks like there was some sendmail problem I don't see this in the
>>> btrfs list,
>>> sending again. sorry for multiple copies if any).
>>>
>>> device_list_add() is called when user runs btrfs dev scan, which would
>>> add
>>> any btrfs device into the btrfs_fs_devices list.
>>>
>>> Now think of a mounted btrfs. And a new device which contains the a SB
>>> from the mounted btrfs devices.
>>>
>>> In this situation when user runs btrfs dev scan, the current code would
>>> just replace existing device with the new device.
>>>
>>> Which is to note that old device is neither closed nor gracefully
>>> removed from the btrfs.
>>>
>>> The FS is still operational with the old bdev however the device name
>>> is the btrfs_device is new which is provided by the btrfs dev scan.
>>>
>>> reproducer:
>>>
>>> devmgt[1] detach /dev/sdc
>>>
>>> replace the missing disk /dev/sdc
>>>
>>> btrfs rep start -f 1 /dev/sde /btrfs
>>> Label: none  uuid: 5dc0aaf4-4683-4050-b2d6-5ebe5f5cd120
>>>          Total devices 2 FS bytes used 32.00KiB
>>>          devid    1 size 958.94MiB used 115.88MiB path /dev/sde
>>>          devid    2 size 958.94MiB used 103.88MiB path /dev/sdd
>>>
>>> make /dev/sdc to reappear
>>>
>>> devmgt attach host2
>>>
>>> btrfs dev scan
>>>
>>> btrfs fi show -m
>>> Label: none  uuid: 5dc0aaf4-4683-4050-b2d6-5ebe5f5cd120^M
>>>          Total devices 2 FS bytes used 32.00KiB^M
>>>          devid    1 size 958.94MiB used 115.88MiB path /dev/sdc <- 
>>> Wrong.
>>>          devid    2 size 958.94MiB used 103.88MiB path /dev/sdd
>>>
>>> since /dev/sdc has been replaced with /dev/sde, the /dev/sdc 
>>> shouldn't be
>>> part of the btrfs-fsid when it reappears. If user want it to be part
>>> of it
>>> then sys admin should be using btrfs device add instead.
>>>
>>> [1] github.com/anajain/devmgt.git
>>>
>>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>>> ---
>>> v2->v3: simpler commit and comment message
>>> v1->v2: remove '---' in commit messages which conflict with git am
>>>
>>>   fs/btrfs/volumes.c | 33 ++++++++++++++++++++++++++++++++-
>>>   1 file changed, 32 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>> index bb2cd66..605d9ef 100644
>>> --- a/fs/btrfs/volumes.c
>>> +++ b/fs/btrfs/volumes.c
>>> @@ -472,9 +472,12 @@ static noinline int device_list_add(const char
>>> *path,
>>>           device = __find_device(&fs_devices->devices, devid,
>>>                          disk_super->dev_item.uuid);
>>>       }
>>> +
>>>       if (!device) {
>>> -        if (fs_devices->opened)
>>> +        if (fs_devices->opened) {
>>> +            printk(KERN_INFO "BTRFS: device list add failed, FS is
>>> open");
>>>               return -EBUSY;
>>> +        }
>>>           device = btrfs_alloc_device(NULL, &devid,
>>>                           disk_super->dev_item.uuid);
>>> @@ -497,6 +500,34 @@ static noinline int device_list_add(const char
>>> *path,
>>>           device->fs_devices = fs_devices;
>>>       } else if (!device->name || strcmp(device->name->str, path)) {
>>> +        /*
>>> +         * When FS is already mounted.
>>> +         * 1. If you are here and if the device->name is NULL that 
>>> means
>>> +         *    this device was missing at time of FS mount.
>>> +         * 2. If you are here and if the device->name is different
>>> from 'path'
>>> +         *    that means either
>>> +         *      a. The same device disappeared and reappeared with
>>> different
>>> +         *         name. or
>>> +         *      b. The missing-disk-which-was-replaced, has
>>> reappeared now.
>>> +         *
>>> +         *  We must allow 1 and 2a above. But 2b would be a 
>>> spurious and
>>> +         *  unintentional.
>>> +         *  Further in case of 1 and 2a above, the disk at 'path'
>>> would have
>>> +         *  missed some transaction when it was away and in case of 2a
>>> +         *  the stale bdev has to be updated as well.
>>> +         *  2b must not be allowed at all time.
>>> +         */
>>> +
>>> +        /*
>>> +         * As of now don't allow update to btrfs_fs_device through the
>>> +         * btrfs dev scan cli, after FS has been mounted.
>>> +         */
>>> +
>>> +        if (fs_devices->opened) {
>>> +            printk(KERN_INFO "BTRFS: device list update failed, FS is
>>> open");
>>> +            return -EBUSY;
>>> +        }
>>> +
>> The 'if(fs_devices->opened)' block is in both branch of the
>> 'if(!device)' judgement,
>> it would be better to extract the code block out of the 'if(!device)'
>> block.
>
>  thanks for the comments Qu.
>
>   we have else if. that is when device is found and path is NOT new
>   (matches with the old) then we shall return success.
>
> Anand
Oh, my bad. Forgot the code can continue without hitting either branch.

Thanks,
Qu
>
>> Thanks,
>> Qu
>>>           name = rcu_string_strdup(path, GFP_NOFS);
>>>           if (!name)
>>>               return -ENOMEM;
>>
>> -- 
>> To unsubscribe from this list: send the line "unsubscribe 
>> linux-btrfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index bb2cd66..605d9ef 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -472,9 +472,12 @@  static noinline int device_list_add(const char *path,
 		device = __find_device(&fs_devices->devices, devid,
 				       disk_super->dev_item.uuid);
 	}
+
 	if (!device) {
-		if (fs_devices->opened)
+		if (fs_devices->opened) {
+			printk(KERN_INFO "BTRFS: device list add failed, FS is open");
 			return -EBUSY;
+		}
 
 		device = btrfs_alloc_device(NULL, &devid,
 					    disk_super->dev_item.uuid);
@@ -497,6 +500,34 @@  static noinline int device_list_add(const char *path,
 
 		device->fs_devices = fs_devices;
 	} else if (!device->name || strcmp(device->name->str, path)) {
+		/*
+		 * When FS is already mounted.
+		 * 1. If you are here and if the device->name is NULL that means
+		 *    this device was missing at time of FS mount.
+		 * 2. If you are here and if the device->name is different from 'path'
+		 *    that means either
+		 *      a. The same device disappeared and reappeared with different
+		 *         name. or
+		 *      b. The missing-disk-which-was-replaced, has reappeared now.
+		 *
+		 *  We must allow 1 and 2a above. But 2b would be a spurious and
+		 *  unintentional.
+		 *  Further in case of 1 and 2a above, the disk at 'path' would have
+		 *  missed some transaction when it was away and in case of 2a
+		 *  the stale bdev has to be updated as well.
+		 *  2b must not be allowed at all time.
+		 */
+
+		/*
+		 * As of now don't allow update to btrfs_fs_device through the
+		 * btrfs dev scan cli, after FS has been mounted.
+		 */
+
+		if (fs_devices->opened) {
+			printk(KERN_INFO "BTRFS: device list update failed, FS is open");
+			return -EBUSY;
+		}
+
 		name = rcu_string_strdup(path, GFP_NOFS);
 		if (!name)
 			return -ENOMEM;