diff mbox

[1/2] btrfs-progs: Add missing devices check for mounted btrfs.

Message ID 1391755560-4721-1-git-send-email-quwenruo@cn.fujitsu.com (mailing list archive)
State Superseded, archived
Headers show

Commit Message

Qu Wenruo Feb. 7, 2014, 6:45 a.m. UTC
In btrfs/003 of xfstest, it will check whether btrfs fi show can find
missing devices.

But before the patch, btrfs-progs will not check whether device missing
if given a mounted btrfs mountpoint/block device.
This patch fixes the bug and will pass btrfs/003.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Cc: Anand Jain <anand.jain@oracle.com>
---
 cmds-filesystem.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

Comments

Anand Jain Feb. 7, 2014, 9:34 a.m. UTC | #1
IMO btrfs-progs shouldn't add its intelligence to know if disk
  is missing. If btrfs-kernel doesn't know when disk is missing
  that's a bug to fix in btrfs-kernel. yes that indeed true as
  of now in btrfs-kernel. btrfs kernel has no idea when disk
  goes missing, just -EIO doesn't tell btrfs that. I am trying
  to fix this first.

  But the problem is there isn't good way with in btrfs/FS
  to know when disk goes missing. did I miss anything ?


Thanks, Anand


On 02/07/2014 02:45 PM, Qu Wenruo wrote:
> In btrfs/003 of xfstest, it will check whether btrfs fi show can find
> missing devices.
>
> But before the patch, btrfs-progs will not check whether device missing
> if given a mounted btrfs mountpoint/block device.
> This patch fixes the bug and will pass btrfs/003.
>
> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
> Cc: Anand Jain <anand.jain@oracle.com>
> ---
>   cmds-filesystem.c | 12 ++++++++++++
>   1 file changed, 12 insertions(+)
>
> diff --git a/cmds-filesystem.c b/cmds-filesystem.c
> index 384d1b9..4c9933d 100644
> --- a/cmds-filesystem.c
> +++ b/cmds-filesystem.c
> @@ -363,6 +363,8 @@ static int print_one_fs(struct btrfs_ioctl_fs_info_args *fs_info,
>   		char *label, char *path)
>   {
>   	int i;
> +	int fd;
> +	int missing;
>   	char uuidbuf[BTRFS_UUID_UNPARSED_SIZE];
>   	struct btrfs_ioctl_dev_info_args *tmp_dev_info;
>   	int ret;
> @@ -385,6 +387,14 @@ static int print_one_fs(struct btrfs_ioctl_fs_info_args *fs_info,
>
>   	for (i = 0; i < fs_info->num_devices; i++) {
>   		tmp_dev_info = (struct btrfs_ioctl_dev_info_args *)&dev_info[i];
> +
> +		/* Add check for missing devices even mounted */
> +		fd = open((char *)tmp_dev_info->path, O_RDONLY);
> +		if (fd < 0) {
> +			missing = 1;
> +			continue;
> +		}
> +		close(fd);
>   		printf("\tdevid %4llu size %s used %s path %s\n",
>   			tmp_dev_info->devid,
>   			pretty_size(tmp_dev_info->total_bytes),
> @@ -392,6 +402,8 @@ static int print_one_fs(struct btrfs_ioctl_fs_info_args *fs_info,
>   			tmp_dev_info->path);
>   	}
>
> +	if (missing)
> +		printf("\t*** Some devices missing\n");
>   	printf("\n");
>   	return 0;
>   }
>
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Qu Wenruo Feb. 10, 2014, 12:36 a.m. UTC | #2
On Fri, 07 Feb 2014 17:34:46 +0800, Anand Jain wrote:
>
>
>  IMO btrfs-progs shouldn't add its intelligence to know if disk
>  is missing. If btrfs-kernel doesn't know when disk is missing
>  that's a bug to fix in btrfs-kernel. yes that indeed true as
>  of now in btrfs-kernel. btrfs kernel has no idea when disk
>  goes missing, just -EIO doesn't tell btrfs that. I am trying
>  to fix this first.
>
>  But the problem is there isn't good way with in btrfs/FS
>  to know when disk goes missing. did I miss anything ?
Yes, kernel detection is the best way.
But since it has no better way to detect missing device, I think the 
btrfs-progs way fix is good enough for now.

Since btrfs fi show with "-d" options will scan the /dev to find fs and 
check missing disks,
I think adds some user-land check even using the ioctl way is still 
somewhat reasonable.

Thanks
Qu
>
>
> Thanks, Anand
>
>
> On 02/07/2014 02:45 PM, Qu Wenruo wrote:
>> In btrfs/003 of xfstest, it will check whether btrfs fi show can find
>> missing devices.
>>
>> But before the patch, btrfs-progs will not check whether device missing
>> if given a mounted btrfs mountpoint/block device.
>> This patch fixes the bug and will pass btrfs/003.
>>
>> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
>> Cc: Anand Jain <anand.jain@oracle.com>
>> ---
>>   cmds-filesystem.c | 12 ++++++++++++
>>   1 file changed, 12 insertions(+)
>>
>> diff --git a/cmds-filesystem.c b/cmds-filesystem.c
>> index 384d1b9..4c9933d 100644
>> --- a/cmds-filesystem.c
>> +++ b/cmds-filesystem.c
>> @@ -363,6 +363,8 @@ static int print_one_fs(struct 
>> btrfs_ioctl_fs_info_args *fs_info,
>>           char *label, char *path)
>>   {
>>       int i;
>> +    int fd;
>> +    int missing;
>>       char uuidbuf[BTRFS_UUID_UNPARSED_SIZE];
>>       struct btrfs_ioctl_dev_info_args *tmp_dev_info;
>>       int ret;
>> @@ -385,6 +387,14 @@ static int print_one_fs(struct 
>> btrfs_ioctl_fs_info_args *fs_info,
>>
>>       for (i = 0; i < fs_info->num_devices; i++) {
>>           tmp_dev_info = (struct btrfs_ioctl_dev_info_args 
>> *)&dev_info[i];
>> +
>> +        /* Add check for missing devices even mounted */
>> +        fd = open((char *)tmp_dev_info->path, O_RDONLY);
>> +        if (fd < 0) {
>> +            missing = 1;
>> +            continue;
>> +        }
>> +        close(fd);
>>           printf("\tdevid %4llu size %s used %s path %s\n",
>>               tmp_dev_info->devid,
>>               pretty_size(tmp_dev_info->total_bytes),
>> @@ -392,6 +402,8 @@ static int print_one_fs(struct 
>> btrfs_ioctl_fs_info_args *fs_info,
>>               tmp_dev_info->path);
>>       }
>>
>> +    if (missing)
>> +        printf("\t*** Some devices missing\n");
>>       printf("\n");
>>       return 0;
>>   }
>>
>

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Anand Jain April 9, 2014, 3:04 a.m. UTC | #3
Below shows the bug cascading to this patch.

  And now to fix this I think we shouldn't fix/workaround in the
  btrfs-progs again!, fix it in the btrfs-kernel (or leave it open
  until suitable fix is found, I tried and failed. but don't fix it
  in a wrong way). If you want to help to fix this problem: Find out
  if we could get kobject notification with in kernel when disks gets
  disappeared.

  I have been advocating btrfs-progs should _not_ add its intelligence
  and it should be as transparent as possible in showing the kernel's
  status. This should be seriously considered.

(-----------
  For patches to take this approach the core problem here is different
  and hope we could correct it..
  First, we have a superficial and wrong measuring tape (xfstest) and
  we are trying to fix the product using it And in between is btrfs-progs
  which is trying to add more superficial-ness.
  2nd, btrfs Wiki has a theory and thus sets the direction that
  btrfs-progs would copy code from btrfs-kernel, I seriously doubt
  if that's a good idea.
  If you want to make btrfs-progs as intelligent as btrfs-kernel
  (which I don't understand why you should ?  since the purpose of
  btrfs-progs and btrfs-kernel are different) then first you need
  develop a mini synchronization mechanism between btrfs-progs and btrfs
  kernel which is as good as two active nodes FS which says from my
  experience with Solaris/SAM-QFS. Developing a synchronization
  mechanism is not in the plan here. Further from the End user
  Application (DB) performance perspective calling sync at the need of
  something like btrfs-progs is a very very bad idea. Applications would
  experience jitters in their steady state performance. Once Solaris had
  this issue and we fixed it.
-----------)

  Have fun. ;-)

----------------------------------------------------------------
$ btrfs dev scan
Scanning for Btrfs filesystems
$ mount /dev/sdc /btrfs
$ btrfs fi show
Label: none  uuid: dfbf136d-e8d2-489b-8ee1-be0d5999769d
	Total devices 2 FS bytes used 663.81MiB
	devid    1 size 1.10GiB used 1.10GiB path /dev/sdf
	devid    2 size 1.10GiB used 1.08GiB path /dev/sdc

$ devmgt show
host0 sda
host1 sdf
host2 sdc
host3 sdd
host4 sde
$ devmgt detach /dev/sdf
-----/dev/kmsg----
	sd 1:0:0:0: [sdf] Stopping disk
	 SUBSYSTEM=scsi
	 DEVICE=+scsi:1:0:0:0
	ata2.00: disabled
------------------
detach /dev/sdf successful

(as a known bug btrfs kernel does not know device is missing, missing 
flag isn't set, as shown below)

$ btrfs-devlist
fsid name uuid (seed_fsid sprout_fsid)
	(fs_latest_devid fs_num_devices fs_open_devices fs_rw_devices 
fs_missing_devices fs_total_devices) fs_total_rw_bytes 
fs_num_can_discard fs_latest_trans
	devid gen total_bytes disk_total_bytes bytes_used type io_align 
io_width sector_size fmode
	fs_flags
	dev_flags

dfbf136d-e8d2-489b-8ee1-be0d5999769d /dev/sdf 
13715cc5-3aeb-4523-b02c-a072fd427a00 (null null)
	(2 2 2 2 0 2) 2363490304 0 7
	1 5 1181745152 1181745152 1181745152 0 4096 4096 4096 0x83
	fs_Mounted|not_fs_Seeding|fs_Rotating
	Writable|MD|not_Missing|not_Discard|not_Replace_tgt|not_Run_pending|not_Nobarriers|Stat_valid|Stat_dirty|Bdev

dfbf136d-e8d2-489b-8ee1-be0d5999769d /dev/sdc 
12ad34f7-8d58-44fa-95cf-b2bbc0cec69d (null null)
	(2 2 2 2 0 2) 2363490304 0 7
	2 7 1181745152 1181745152 1160773632 0 4096 4096 4096 0x83
	fs_Mounted|not_fs_Seeding|fs_Rotating
	Writable|MD|not_Missing|not_Discard|not_Replace_tgt|not_Run_pending|not_Nobarriers|Stat_valid|Stat_dirty|Bdev


(below btrfs-progs patch added intelligence to tell the world that 
device is missing)

Ref:
~~~~~~~
commit 2ae6a037efd52ae0fa30052d456ad07f074f5d54
Author: Qu Wenruo <quwenruo@cn.fujitsu.com>
Date:   Fri Feb 7 15:07:19 2014 +0800

     btrfs-progs: Add missing devices check for mounted btrfs.
~~~~~~~

$ btrfs fi show
Label: none  uuid: dfbf136d-e8d2-489b-8ee1-be0d5999769d
	Total devices 2 FS bytes used 663.81MiB
	devid    2 size 1.10GiB used 1.08GiB path /dev/sdc
	*** Some devices missing

$ btrfs dev add /dev/sde /btrfs
$ btrfs fi show
Label: none  uuid: dfbf136d-e8d2-489b-8ee1-be0d5999769d
	Total devices 3 FS bytes used 663.81MiB
	devid    2 size 1.10GiB used 1.08GiB path /dev/sdc
	devid    3 size 1.04GiB used 0.00 path /dev/sde
	*** Some devices missing


Now the bug is delete missing fails. Since kernel don't
understand whats missing.

$ btrfs dev del missing /btrfs
ERROR: error removing the device 'missing' - no missing devices found to 
remove
$
---------------------------------------------------------------------------





On 02/10/2014 08:36 AM, Qu Wenruo wrote:
> On Fri, 07 Feb 2014 17:34:46 +0800, Anand Jain wrote:
>>
>>
>>  IMO btrfs-progs shouldn't add its intelligence to know if disk
>>  is missing. If btrfs-kernel doesn't know when disk is missing
>>  that's a bug to fix in btrfs-kernel. yes that indeed true as
>>  of now in btrfs-kernel. btrfs kernel has no idea when disk
>>  goes missing, just -EIO doesn't tell btrfs that. I am trying
>>  to fix this first.
>>
>>  But the problem is there isn't good way with in btrfs/FS
>>  to know when disk goes missing. did I miss anything ?
> Yes, kernel detection is the best way.
> But since it has no better way to detect missing device, I think the
> btrfs-progs way fix is good enough for now.
>
> Since btrfs fi show with "-d" options will scan the /dev to find fs and
> check missing disks,
> I think adds some user-land check even using the ioctl way is still
> somewhat reasonable.
>
> Thanks
> Qu
>>
>>
>> Thanks, Anand
>>
>>
>> On 02/07/2014 02:45 PM, Qu Wenruo wrote:
>>> In btrfs/003 of xfstest, it will check whether btrfs fi show can find
>>> missing devices.
>>>
>>> But before the patch, btrfs-progs will not check whether device missing
>>> if given a mounted btrfs mountpoint/block device.
>>> This patch fixes the bug and will pass btrfs/003.
>>>
>>> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
>>> Cc: Anand Jain <anand.jain@oracle.com>
>>> ---
>>>   cmds-filesystem.c | 12 ++++++++++++
>>>   1 file changed, 12 insertions(+)
>>>
>>> diff --git a/cmds-filesystem.c b/cmds-filesystem.c
>>> index 384d1b9..4c9933d 100644
>>> --- a/cmds-filesystem.c
>>> +++ b/cmds-filesystem.c
>>> @@ -363,6 +363,8 @@ static int print_one_fs(struct
>>> btrfs_ioctl_fs_info_args *fs_info,
>>>           char *label, char *path)
>>>   {
>>>       int i;
>>> +    int fd;
>>> +    int missing;
>>>       char uuidbuf[BTRFS_UUID_UNPARSED_SIZE];
>>>       struct btrfs_ioctl_dev_info_args *tmp_dev_info;
>>>       int ret;
>>> @@ -385,6 +387,14 @@ static int print_one_fs(struct
>>> btrfs_ioctl_fs_info_args *fs_info,
>>>
>>>       for (i = 0; i < fs_info->num_devices; i++) {
>>>           tmp_dev_info = (struct btrfs_ioctl_dev_info_args
>>> *)&dev_info[i];
>>> +
>>> +        /* Add check for missing devices even mounted */
>>> +        fd = open((char *)tmp_dev_info->path, O_RDONLY);
>>> +        if (fd < 0) {
>>> +            missing = 1;
>>> +            continue;
>>> +        }
>>> +        close(fd);
>>>           printf("\tdevid %4llu size %s used %s path %s\n",
>>>               tmp_dev_info->devid,
>>>               pretty_size(tmp_dev_info->total_bytes),
>>> @@ -392,6 +402,8 @@ static int print_one_fs(struct
>>> btrfs_ioctl_fs_info_args *fs_info,
>>>               tmp_dev_info->path);
>>>       }
>>>
>>> +    if (missing)
>>> +        printf("\t*** Some devices missing\n");
>>>       printf("\n");
>>>       return 0;
>>>   }
>>>
>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Qu Wenruo April 9, 2014, 3:26 a.m. UTC | #4
Yes, the deleted device scan is still one of the deep problems yet.

But my patch is not intented to deal anything related to the problem.

For me, I am *only* going to deal with the *exit code* problem,
'btrfs fi show <device>' executes correctly(OK, only part of it exactlly),
but exit code is still 1, which is the bug I'm trying to fix it.

IMO, the users/admins may never be interested in the inside mechanism 
nor algorithm,
but the output and exit value things.

This bug is much like the previous 'btrfs fi show' bugs that breaks some 
xfstests test case
(always showing a error due to the scan_kernel_v2 function which calls a 
unimplemented ioctl interface),
if some patch breaks the old exit code or output, then it should be 
fixed to maintain the old
output/exit code (except some big decision is made to change it).

So for the output/exit code consistence, it should be fixed even the 
patch may not  means a cure but
only a workaround for you.

Thanks,
Qu.

? 2014?04?09? 11:04, Anand Jain ??:
>
>
>  Below shows the bug cascading to this patch.
>
>  And now to fix this I think we shouldn't fix/workaround in the
>  btrfs-progs again!, fix it in the btrfs-kernel (or leave it open
>  until suitable fix is found, I tried and failed. but don't fix it
>  in a wrong way). If you want to help to fix this problem: Find out
>  if we could get kobject notification with in kernel when disks gets
>  disappeared.
>
>  I have been advocating btrfs-progs should _not_ add its intelligence
>  and it should be as transparent as possible in showing the kernel's
>  status. This should be seriously considered.
>
> (-----------
>  For patches to take this approach the core problem here is different
>  and hope we could correct it..
>  First, we have a superficial and wrong measuring tape (xfstest) and
>  we are trying to fix the product using it And in between is btrfs-progs
>  which is trying to add more superficial-ness.
>  2nd, btrfs Wiki has a theory and thus sets the direction that
>  btrfs-progs would copy code from btrfs-kernel, I seriously doubt
>  if that's a good idea.
>  If you want to make btrfs-progs as intelligent as btrfs-kernel
>  (which I don't understand why you should ?  since the purpose of
>  btrfs-progs and btrfs-kernel are different) then first you need
>  develop a mini synchronization mechanism between btrfs-progs and btrfs
>  kernel which is as good as two active nodes FS which says from my
>  experience with Solaris/SAM-QFS. Developing a synchronization
>  mechanism is not in the plan here. Further from the End user
>  Application (DB) performance perspective calling sync at the need of
>  something like btrfs-progs is a very very bad idea. Applications would
>  experience jitters in their steady state performance. Once Solaris had
>  this issue and we fixed it.
> -----------)
>
>  Have fun. ;-)
>
> ----------------------------------------------------------------
> $ btrfs dev scan
> Scanning for Btrfs filesystems
> $ mount /dev/sdc /btrfs
> $ btrfs fi show
> Label: none  uuid: dfbf136d-e8d2-489b-8ee1-be0d5999769d
>     Total devices 2 FS bytes used 663.81MiB
>     devid    1 size 1.10GiB used 1.10GiB path /dev/sdf
>     devid    2 size 1.10GiB used 1.08GiB path /dev/sdc
>
> $ devmgt show
> host0 sda
> host1 sdf
> host2 sdc
> host3 sdd
> host4 sde
> $ devmgt detach /dev/sdf
> -----/dev/kmsg----
>     sd 1:0:0:0: [sdf] Stopping disk
>      SUBSYSTEM=scsi
>      DEVICE=+scsi:1:0:0:0
>     ata2.00: disabled
> ------------------
> detach /dev/sdf successful
>
> (as a known bug btrfs kernel does not know device is missing, missing 
> flag isn't set, as shown below)
>
> $ btrfs-devlist
> fsid name uuid (seed_fsid sprout_fsid)
>     (fs_latest_devid fs_num_devices fs_open_devices fs_rw_devices 
> fs_missing_devices fs_total_devices) fs_total_rw_bytes 
> fs_num_can_discard fs_latest_trans
>     devid gen total_bytes disk_total_bytes bytes_used type io_align 
> io_width sector_size fmode
>     fs_flags
>     dev_flags
>
> dfbf136d-e8d2-489b-8ee1-be0d5999769d /dev/sdf 
> 13715cc5-3aeb-4523-b02c-a072fd427a00 (null null)
>     (2 2 2 2 0 2) 2363490304 0 7
>     1 5 1181745152 1181745152 1181745152 0 4096 4096 4096 0x83
>     fs_Mounted|not_fs_Seeding|fs_Rotating
>     Writable|MD|not_Missing|not_Discard|not_Replace_tgt|not_Run_pending|not_Nobarriers|Stat_valid|Stat_dirty|Bdev 
>
>
> dfbf136d-e8d2-489b-8ee1-be0d5999769d /dev/sdc 
> 12ad34f7-8d58-44fa-95cf-b2bbc0cec69d (null null)
>     (2 2 2 2 0 2) 2363490304 0 7
>     2 7 1181745152 1181745152 1160773632 0 4096 4096 4096 0x83
>     fs_Mounted|not_fs_Seeding|fs_Rotating
>     Writable|MD|not_Missing|not_Discard|not_Replace_tgt|not_Run_pending|not_Nobarriers|Stat_valid|Stat_dirty|Bdev 
>
>
>
> (below btrfs-progs patch added intelligence to tell the world that 
> device is missing)
>
> Ref:
> ~~~~~~~
> commit 2ae6a037efd52ae0fa30052d456ad07f074f5d54
> Author: Qu Wenruo <quwenruo@cn.fujitsu.com>
> Date:   Fri Feb 7 15:07:19 2014 +0800
>
>     btrfs-progs: Add missing devices check for mounted btrfs.
> ~~~~~~~
>
> $ btrfs fi show
> Label: none  uuid: dfbf136d-e8d2-489b-8ee1-be0d5999769d
>     Total devices 2 FS bytes used 663.81MiB
>     devid    2 size 1.10GiB used 1.08GiB path /dev/sdc
>     *** Some devices missing
>
> $ btrfs dev add /dev/sde /btrfs
> $ btrfs fi show
> Label: none  uuid: dfbf136d-e8d2-489b-8ee1-be0d5999769d
>     Total devices 3 FS bytes used 663.81MiB
>     devid    2 size 1.10GiB used 1.08GiB path /dev/sdc
>     devid    3 size 1.04GiB used 0.00 path /dev/sde
>     *** Some devices missing
>
>
> Now the bug is delete missing fails. Since kernel don't
> understand whats missing.
>
> $ btrfs dev del missing /btrfs
> ERROR: error removing the device 'missing' - no missing devices found 
> to remove
> $
> --------------------------------------------------------------------------- 
>
>
>
>
>
>
> On 02/10/2014 08:36 AM, Qu Wenruo wrote:
>> On Fri, 07 Feb 2014 17:34:46 +0800, Anand Jain wrote:
>>>
>>>
>>>  IMO btrfs-progs shouldn't add its intelligence to know if disk
>>>  is missing. If btrfs-kernel doesn't know when disk is missing
>>>  that's a bug to fix in btrfs-kernel. yes that indeed true as
>>>  of now in btrfs-kernel. btrfs kernel has no idea when disk
>>>  goes missing, just -EIO doesn't tell btrfs that. I am trying
>>>  to fix this first.
>>>
>>>  But the problem is there isn't good way with in btrfs/FS
>>>  to know when disk goes missing. did I miss anything ?
>> Yes, kernel detection is the best way.
>> But since it has no better way to detect missing device, I think the
>> btrfs-progs way fix is good enough for now.
>>
>> Since btrfs fi show with "-d" options will scan the /dev to find fs and
>> check missing disks,
>> I think adds some user-land check even using the ioctl way is still
>> somewhat reasonable.
>>
>> Thanks
>> Qu
>>>
>>>
>>> Thanks, Anand
>>>
>>>
>>> On 02/07/2014 02:45 PM, Qu Wenruo wrote:
>>>> In btrfs/003 of xfstest, it will check whether btrfs fi show can find
>>>> missing devices.
>>>>
>>>> But before the patch, btrfs-progs will not check whether device 
>>>> missing
>>>> if given a mounted btrfs mountpoint/block device.
>>>> This patch fixes the bug and will pass btrfs/003.
>>>>
>>>> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
>>>> Cc: Anand Jain <anand.jain@oracle.com>
>>>> ---
>>>>   cmds-filesystem.c | 12 ++++++++++++
>>>>   1 file changed, 12 insertions(+)
>>>>
>>>> diff --git a/cmds-filesystem.c b/cmds-filesystem.c
>>>> index 384d1b9..4c9933d 100644
>>>> --- a/cmds-filesystem.c
>>>> +++ b/cmds-filesystem.c
>>>> @@ -363,6 +363,8 @@ static int print_one_fs(struct
>>>> btrfs_ioctl_fs_info_args *fs_info,
>>>>           char *label, char *path)
>>>>   {
>>>>       int i;
>>>> +    int fd;
>>>> +    int missing;
>>>>       char uuidbuf[BTRFS_UUID_UNPARSED_SIZE];
>>>>       struct btrfs_ioctl_dev_info_args *tmp_dev_info;
>>>>       int ret;
>>>> @@ -385,6 +387,14 @@ static int print_one_fs(struct
>>>> btrfs_ioctl_fs_info_args *fs_info,
>>>>
>>>>       for (i = 0; i < fs_info->num_devices; i++) {
>>>>           tmp_dev_info = (struct btrfs_ioctl_dev_info_args
>>>> *)&dev_info[i];
>>>> +
>>>> +        /* Add check for missing devices even mounted */
>>>> +        fd = open((char *)tmp_dev_info->path, O_RDONLY);
>>>> +        if (fd < 0) {
>>>> +            missing = 1;
>>>> +            continue;
>>>> +        }
>>>> +        close(fd);
>>>>           printf("\tdevid %4llu size %s used %s path %s\n",
>>>>               tmp_dev_info->devid,
>>>>               pretty_size(tmp_dev_info->total_bytes),
>>>> @@ -392,6 +402,8 @@ static int print_one_fs(struct
>>>> btrfs_ioctl_fs_info_args *fs_info,
>>>>               tmp_dev_info->path);
>>>>       }
>>>>
>>>> +    if (missing)
>>>> +        printf("\t*** Some devices missing\n");
>>>>       printf("\n");
>>>>       return 0;
>>>>   }
>>>>
>>>
>>
>> -- 
>> To unsubscribe from this list: send the line "unsubscribe 
>> linux-btrfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> -- 
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Anand Jain April 9, 2014, 4:33 a.m. UTC | #5
A bit of background of btrfs fi show.

As such original btrfs fi show had too many problems since btrfs-progs
wasn't much consulting btrfs-kernel to determine various
mounted device status. This was a serious problem sometime back.
Various patches brings btrfs-progs to communicate with kernel
and to tell the world the exact status as seen by the kernel.
And avoid any btrfs status fabrication in the user land for the
mounted disks.

The old code was move into the option "-d" and newer which
would print status as per the kernel comes under "-m"
which stands for mounted.

The cli btrfs fi show (with no option) would use both -m
and user-land (lblkid) for unmounted btrfs to show various
disks status.

But here this patch is again bring the old problem and idea into life,
that is it fabricates "missing" at the userland for the mounted btrfs.
As shown in the below testcase it has bad cascading consequences.

This patch must be taken out.

If your worry is about xfstests/btrfs/003 fails, yes it does, we
don't have the right solution as of now to fix it.


On 09/04/2014 11:26, Qu Wenruo wrote:
> Yes, the deleted device scan is still one of the deep problems yet.
>
> But my patch is not intented to deal anything related to the problem.


> For me, I am *only* going to deal with the *exit code* problem,
> 'btrfs fi show <device>' executes correctly(OK, only part of it exactlly),
> but exit code is still 1, which is the bug I'm trying to fix it.

Looks out of context, am I missing something ?

> IMO, the users/admins may never be interested in the inside mechanism
> nor algorithm,
> but the output and exit value things.

user trust btrfs-progs to tell them whats going on in the btrfs kernel.

there is no point in fabricating things (like missing) at the user land
when btrfs kernel isn't aware of it.

> This bug is much like the previous 'btrfs fi show' bugs that breaks some
> xfstests test case

I guess you are talking about the btrfs/003 test case, think the 
correct-way that a xfstest should have determined if the disk is
missing is to dump the btrfs_fs_devices -> btrfs_device and to check
if flag missing is set. xfstest trust btrfs-progs but btrfs-progs don't
do it.

If there is real test program which would check btrfs disk status from
the btrfs kernel memory then again this bug (missing flag is not set)
would exist.


> (always showing a error due to the scan_kernel_v2 function which calls a
> unimplemented ioctl interface),

The problem is its matching kernel patch is not integrated.


> if some patch breaks the old exit code or output, then it should be
> fixed to maintain the old
> output/exit code (except some big decision is made to change it).

Yes but fix it in the right way - notify btrfs kernel about the
missing disks, which would/should set missing flag in the btrfs_device
list.

> So for the output/exit code consistence, it should be fixed even the
> patch may not  means a cure but
> only a workaround for you.

There is no short cut. As shown your patch has the cascading problem.
When btrfs-kernel fixes this issue, And if David and Josef agrees a
patch on top of btrfs-devlist patch will help to fix this problem
once for all.

Thanks, Anand

> Thanks,
> Qu.
>
> ? 2014?04?09? 11:04, Anand Jain ??:
>>
>>
>>  Below shows the bug cascading to this patch.
>>
>>  And now to fix this I think we shouldn't fix/workaround in the
>>  btrfs-progs again!, fix it in the btrfs-kernel (or leave it open
>>  until suitable fix is found, I tried and failed. but don't fix it
>>  in a wrong way). If you want to help to fix this problem: Find out
>>  if we could get kobject notification with in kernel when disks gets
>>  disappeared.
>>
>>  I have been advocating btrfs-progs should _not_ add its intelligence
>>  and it should be as transparent as possible in showing the kernel's
>>  status. This should be seriously considered.
>>
>> (-----------
>>  For patches to take this approach the core problem here is different
>>  and hope we could correct it..
>>  First, we have a superficial and wrong measuring tape (xfstest) and
>>  we are trying to fix the product using it And in between is btrfs-progs
>>  which is trying to add more superficial-ness.
>>  2nd, btrfs Wiki has a theory and thus sets the direction that
>>  btrfs-progs would copy code from btrfs-kernel, I seriously doubt
>>  if that's a good idea.
>>  If you want to make btrfs-progs as intelligent as btrfs-kernel
>>  (which I don't understand why you should ?  since the purpose of
>>  btrfs-progs and btrfs-kernel are different) then first you need
>>  develop a mini synchronization mechanism between btrfs-progs and btrfs
>>  kernel which is as good as two active nodes FS which says from my
>>  experience with Solaris/SAM-QFS. Developing a synchronization
>>  mechanism is not in the plan here. Further from the End user
>>  Application (DB) performance perspective calling sync at the need of
>>  something like btrfs-progs is a very very bad idea. Applications would
>>  experience jitters in their steady state performance. Once Solaris had
>>  this issue and we fixed it.
>> -----------)
>>
>>  Have fun. ;-)
>>
>> ----------------------------------------------------------------
>> $ btrfs dev scan
>> Scanning for Btrfs filesystems
>> $ mount /dev/sdc /btrfs
>> $ btrfs fi show
>> Label: none  uuid: dfbf136d-e8d2-489b-8ee1-be0d5999769d
>>     Total devices 2 FS bytes used 663.81MiB
>>     devid    1 size 1.10GiB used 1.10GiB path /dev/sdf
>>     devid    2 size 1.10GiB used 1.08GiB path /dev/sdc
>>
>> $ devmgt show
>> host0 sda
>> host1 sdf
>> host2 sdc
>> host3 sdd
>> host4 sde
>> $ devmgt detach /dev/sdf
>> -----/dev/kmsg----
>>     sd 1:0:0:0: [sdf] Stopping disk
>>      SUBSYSTEM=scsi
>>      DEVICE=+scsi:1:0:0:0
>>     ata2.00: disabled
>> ------------------
>> detach /dev/sdf successful
>>
>> (as a known bug btrfs kernel does not know device is missing, missing
>> flag isn't set, as shown below)
>>
>> $ btrfs-devlist
>> fsid name uuid (seed_fsid sprout_fsid)
>>     (fs_latest_devid fs_num_devices fs_open_devices fs_rw_devices
>> fs_missing_devices fs_total_devices) fs_total_rw_bytes
>> fs_num_can_discard fs_latest_trans
>>     devid gen total_bytes disk_total_bytes bytes_used type io_align
>> io_width sector_size fmode
>>     fs_flags
>>     dev_flags
>>
>> dfbf136d-e8d2-489b-8ee1-be0d5999769d /dev/sdf
>> 13715cc5-3aeb-4523-b02c-a072fd427a00 (null null)
>>     (2 2 2 2 0 2) 2363490304 0 7
>>     1 5 1181745152 1181745152 1181745152 0 4096 4096 4096 0x83
>>     fs_Mounted|not_fs_Seeding|fs_Rotating
>>
>> Writable|MD|not_Missing|not_Discard|not_Replace_tgt|not_Run_pending|not_Nobarriers|Stat_valid|Stat_dirty|Bdev
>>
>>
>> dfbf136d-e8d2-489b-8ee1-be0d5999769d /dev/sdc
>> 12ad34f7-8d58-44fa-95cf-b2bbc0cec69d (null null)
>>     (2 2 2 2 0 2) 2363490304 0 7
>>     2 7 1181745152 1181745152 1160773632 0 4096 4096 4096 0x83
>>     fs_Mounted|not_fs_Seeding|fs_Rotating
>>
>> Writable|MD|not_Missing|not_Discard|not_Replace_tgt|not_Run_pending|not_Nobarriers|Stat_valid|Stat_dirty|Bdev
>>
>>
>>
>> (below btrfs-progs patch added intelligence to tell the world that
>> device is missing)
>>
>> Ref:
>> ~~~~~~~
>> commit 2ae6a037efd52ae0fa30052d456ad07f074f5d54
>> Author: Qu Wenruo <quwenruo@cn.fujitsu.com>
>> Date:   Fri Feb 7 15:07:19 2014 +0800
>>
>>     btrfs-progs: Add missing devices check for mounted btrfs.
>> ~~~~~~~
>>
>> $ btrfs fi show
>> Label: none  uuid: dfbf136d-e8d2-489b-8ee1-be0d5999769d
>>     Total devices 2 FS bytes used 663.81MiB
>>     devid    2 size 1.10GiB used 1.08GiB path /dev/sdc
>>     *** Some devices missing
>>
>> $ btrfs dev add /dev/sde /btrfs
>> $ btrfs fi show
>> Label: none  uuid: dfbf136d-e8d2-489b-8ee1-be0d5999769d
>>     Total devices 3 FS bytes used 663.81MiB
>>     devid    2 size 1.10GiB used 1.08GiB path /dev/sdc
>>     devid    3 size 1.04GiB used 0.00 path /dev/sde
>>     *** Some devices missing
>>
>>
>> Now the bug is delete missing fails. Since kernel don't
>> understand whats missing.
>>
>> $ btrfs dev del missing /btrfs
>> ERROR: error removing the device 'missing' - no missing devices found
>> to remove
>> $
>> ---------------------------------------------------------------------------
>>
>>
>>
>>
>>
>>
>> On 02/10/2014 08:36 AM, Qu Wenruo wrote:
>>> On Fri, 07 Feb 2014 17:34:46 +0800, Anand Jain wrote:
>>>>
>>>>
>>>>  IMO btrfs-progs shouldn't add its intelligence to know if disk
>>>>  is missing. If btrfs-kernel doesn't know when disk is missing
>>>>  that's a bug to fix in btrfs-kernel. yes that indeed true as
>>>>  of now in btrfs-kernel. btrfs kernel has no idea when disk
>>>>  goes missing, just -EIO doesn't tell btrfs that. I am trying
>>>>  to fix this first.
>>>>
>>>>  But the problem is there isn't good way with in btrfs/FS
>>>>  to know when disk goes missing. did I miss anything ?
>>> Yes, kernel detection is the best way.
>>> But since it has no better way to detect missing device, I think the
>>> btrfs-progs way fix is good enough for now.
>>>
>>> Since btrfs fi show with "-d" options will scan the /dev to find fs and
>>> check missing disks,
>>> I think adds some user-land check even using the ioctl way is still
>>> somewhat reasonable.
>>>
>>> Thanks
>>> Qu
>>>>
>>>>
>>>> Thanks, Anand
>>>>
>>>>
>>>> On 02/07/2014 02:45 PM, Qu Wenruo wrote:
>>>>> In btrfs/003 of xfstest, it will check whether btrfs fi show can find
>>>>> missing devices.
>>>>>
>>>>> But before the patch, btrfs-progs will not check whether device
>>>>> missing
>>>>> if given a mounted btrfs mountpoint/block device.
>>>>> This patch fixes the bug and will pass btrfs/003.
>>>>>
>>>>> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
>>>>> Cc: Anand Jain <anand.jain@oracle.com>
>>>>> ---
>>>>>   cmds-filesystem.c | 12 ++++++++++++
>>>>>   1 file changed, 12 insertions(+)
>>>>>
>>>>> diff --git a/cmds-filesystem.c b/cmds-filesystem.c
>>>>> index 384d1b9..4c9933d 100644
>>>>> --- a/cmds-filesystem.c
>>>>> +++ b/cmds-filesystem.c
>>>>> @@ -363,6 +363,8 @@ static int print_one_fs(struct
>>>>> btrfs_ioctl_fs_info_args *fs_info,
>>>>>           char *label, char *path)
>>>>>   {
>>>>>       int i;
>>>>> +    int fd;
>>>>> +    int missing;
>>>>>       char uuidbuf[BTRFS_UUID_UNPARSED_SIZE];
>>>>>       struct btrfs_ioctl_dev_info_args *tmp_dev_info;
>>>>>       int ret;
>>>>> @@ -385,6 +387,14 @@ static int print_one_fs(struct
>>>>> btrfs_ioctl_fs_info_args *fs_info,
>>>>>
>>>>>       for (i = 0; i < fs_info->num_devices; i++) {
>>>>>           tmp_dev_info = (struct btrfs_ioctl_dev_info_args
>>>>> *)&dev_info[i];
>>>>> +
>>>>> +        /* Add check for missing devices even mounted */
>>>>> +        fd = open((char *)tmp_dev_info->path, O_RDONLY);
>>>>> +        if (fd < 0) {
>>>>> +            missing = 1;
>>>>> +            continue;
>>>>> +        }
>>>>> +        close(fd);
>>>>>           printf("\tdevid %4llu size %s used %s path %s\n",
>>>>>               tmp_dev_info->devid,
>>>>>               pretty_size(tmp_dev_info->total_bytes),
>>>>> @@ -392,6 +402,8 @@ static int print_one_fs(struct
>>>>> btrfs_ioctl_fs_info_args *fs_info,
>>>>>               tmp_dev_info->path);
>>>>>       }
>>>>>
>>>>> +    if (missing)
>>>>> +        printf("\t*** Some devices missing\n");
>>>>>       printf("\n");
>>>>>       return 0;
>>>>>   }
>>>>>
>>>>
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe
>>> linux-btrfs" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Qu Wenruo April 9, 2014, 6:55 a.m. UTC | #6
? 2014?04?09? 12:33, Anand Jain ??:
>
>
> A bit of background of btrfs fi show.
>
> As such original btrfs fi show had too many problems since btrfs-progs
> wasn't much consulting btrfs-kernel to determine various
> mounted device status. This was a serious problem sometime back.
> Various patches brings btrfs-progs to communicate with kernel
> and to tell the world the exact status as seen by the kernel.
> And avoid any btrfs status fabrication in the user land for the
> mounted disks.
>
> The old code was move into the option "-d" and newer which
> would print status as per the kernel comes under "-m"
> which stands for mounted.
>
> The cli btrfs fi show (with no option) would use both -m
> and user-land (lblkid) for unmounted btrfs to show various
> disks status.
>
> But here this patch is again bring the old problem and idea into life,
> that is it fabricates "missing" at the userland for the mounted btrfs.
> As shown in the below testcase it has bad cascading consequences.
>
> This patch must be taken out.
>
> If your worry is about xfstests/btrfs/003 fails, yes it does, we
> don't have the right solution as of now to fix it.

Yes, I'm still worry about the xfstests testcase.
Personally, I think you should try to remove the disk remove test in 
btrfs/003 first.
(Though I think it could be much harder to archieve it)

If xfstests devs can be persuaded and NOTE added to btrfs-progs document,
I'll be completely OK for the revert of this patch.

>
>
> On 09/04/2014 11:26, Qu Wenruo wrote:
>> Yes, the deleted device scan is still one of the deep problems yet.
>>
>> But my patch is not intented to deal anything related to the problem.
>
>
>> For me, I am *only* going to deal with the *exit code* problem,
>> 'btrfs fi show <device>' executes correctly(OK, only part of it 
>> exactlly),
>> but exit code is still 1, which is the bug I'm trying to fix it.
>
> Looks out of context, am I missing something ?

Sorry for getting mixed with the patch dealing with the exit code.
(Previous patch I sent)

>
>> IMO, the users/admins may never be interested in the inside mechanism
>> nor algorithm,
>> but the output and exit value things.
>
> user trust btrfs-progs to tell them whats going on in the btrfs kernel.

IMO user trust btrfs-progs to give them right info on btrfs, not care 
nor aware of the relationship
between kernel.

>
> there is no point in fabricating things (like missing) at the user land
> when btrfs kernel isn't aware of it.

 From the respect of normal users/admins, they does not cares how it is 
done but the result.
So if user finds missing devices still output in 'btrfs fi show', they 
will be confused.

Though I think any sane users/admins will call 'btrfs dev del' first,
so the problem still comes to the btrfs/003 test case.

>
>> This bug is much like the previous 'btrfs fi show' bugs that breaks some
>> xfstests test case
>
> I guess you are talking about the btrfs/003 test case, think the 
> correct-way that a xfstest should have determined if the disk is
> missing is to dump the btrfs_fs_devices -> btrfs_device and to check
> if flag missing is set. xfstest trust btrfs-progs but btrfs-progs don't
> do it.
>
> If there is real test program which would check btrfs disk status from
> the btrfs kernel memory then again this bug (missing flag is not set)
> would exist.

This bug is not the dev delete but the error message due to the 
unimplemented
btrfs ioctl things below.

>
>
>> (always showing a error due to the scan_kernel_v2 function which calls a
>> unimplemented ioctl interface),
>
> The problem is its matching kernel patch is not integrated.
>
>
>> if some patch breaks the old exit code or output, then it should be
>> fixed to maintain the old
>> output/exit code (except some big decision is made to change it).
>
> Yes but fix it in the right way - notify btrfs kernel about the
> missing disks, which would/should set missing flag in the btrfs_device
> list.

As I mentioned, this patch is just a workaround for testcase btrfs/003.
No mean to fix the whole things.

As mentioned at the beginning, I prefer to remove the device remove test
in testcase, then the patch can be reverted without any complain from me.

Thanks,
Qu.
>
>> So for the output/exit code consistence, it should be fixed even the
>> patch may not  means a cure but
>> only a workaround for you.
>
> There is no short cut. As shown your patch has the cascading problem.
> When btrfs-kernel fixes this issue, And if David and Josef agrees a
> patch on top of btrfs-devlist patch will help to fix this problem
> once for all.
>
> Thanks, Anand
>
>> Thanks,
>> Qu.
>>
>> ? 2014?04?09? 11:04, Anand Jain ??:
>>>
>>>
>>>  Below shows the bug cascading to this patch.
>>>
>>>  And now to fix this I think we shouldn't fix/workaround in the
>>>  btrfs-progs again!, fix it in the btrfs-kernel (or leave it open
>>>  until suitable fix is found, I tried and failed. but don't fix it
>>>  in a wrong way). If you want to help to fix this problem: Find out
>>>  if we could get kobject notification with in kernel when disks gets
>>>  disappeared.
>>>
>>>  I have been advocating btrfs-progs should _not_ add its intelligence
>>>  and it should be as transparent as possible in showing the kernel's
>>>  status. This should be seriously considered.
>>>
>>> (-----------
>>>  For patches to take this approach the core problem here is different
>>>  and hope we could correct it..
>>>  First, we have a superficial and wrong measuring tape (xfstest) and
>>>  we are trying to fix the product using it And in between is 
>>> btrfs-progs
>>>  which is trying to add more superficial-ness.
>>>  2nd, btrfs Wiki has a theory and thus sets the direction that
>>>  btrfs-progs would copy code from btrfs-kernel, I seriously doubt
>>>  if that's a good idea.
>>>  If you want to make btrfs-progs as intelligent as btrfs-kernel
>>>  (which I don't understand why you should ?  since the purpose of
>>>  btrfs-progs and btrfs-kernel are different) then first you need
>>>  develop a mini synchronization mechanism between btrfs-progs and btrfs
>>>  kernel which is as good as two active nodes FS which says from my
>>>  experience with Solaris/SAM-QFS. Developing a synchronization
>>>  mechanism is not in the plan here. Further from the End user
>>>  Application (DB) performance perspective calling sync at the need of
>>>  something like btrfs-progs is a very very bad idea. Applications would
>>>  experience jitters in their steady state performance. Once Solaris had
>>>  this issue and we fixed it.
>>> -----------)
>>>
>>>  Have fun. ;-)
>>>
>>> ----------------------------------------------------------------
>>> $ btrfs dev scan
>>> Scanning for Btrfs filesystems
>>> $ mount /dev/sdc /btrfs
>>> $ btrfs fi show
>>> Label: none  uuid: dfbf136d-e8d2-489b-8ee1-be0d5999769d
>>>     Total devices 2 FS bytes used 663.81MiB
>>>     devid    1 size 1.10GiB used 1.10GiB path /dev/sdf
>>>     devid    2 size 1.10GiB used 1.08GiB path /dev/sdc
>>>
>>> $ devmgt show
>>> host0 sda
>>> host1 sdf
>>> host2 sdc
>>> host3 sdd
>>> host4 sde
>>> $ devmgt detach /dev/sdf
>>> -----/dev/kmsg----
>>>     sd 1:0:0:0: [sdf] Stopping disk
>>>      SUBSYSTEM=scsi
>>>      DEVICE=+scsi:1:0:0:0
>>>     ata2.00: disabled
>>> ------------------
>>> detach /dev/sdf successful
>>>
>>> (as a known bug btrfs kernel does not know device is missing, missing
>>> flag isn't set, as shown below)
>>>
>>> $ btrfs-devlist
>>> fsid name uuid (seed_fsid sprout_fsid)
>>>     (fs_latest_devid fs_num_devices fs_open_devices fs_rw_devices
>>> fs_missing_devices fs_total_devices) fs_total_rw_bytes
>>> fs_num_can_discard fs_latest_trans
>>>     devid gen total_bytes disk_total_bytes bytes_used type io_align
>>> io_width sector_size fmode
>>>     fs_flags
>>>     dev_flags
>>>
>>> dfbf136d-e8d2-489b-8ee1-be0d5999769d /dev/sdf
>>> 13715cc5-3aeb-4523-b02c-a072fd427a00 (null null)
>>>     (2 2 2 2 0 2) 2363490304 0 7
>>>     1 5 1181745152 1181745152 1181745152 0 4096 4096 4096 0x83
>>>     fs_Mounted|not_fs_Seeding|fs_Rotating
>>>
>>> Writable|MD|not_Missing|not_Discard|not_Replace_tgt|not_Run_pending|not_Nobarriers|Stat_valid|Stat_dirty|Bdev 
>>>
>>>
>>>
>>> dfbf136d-e8d2-489b-8ee1-be0d5999769d /dev/sdc
>>> 12ad34f7-8d58-44fa-95cf-b2bbc0cec69d (null null)
>>>     (2 2 2 2 0 2) 2363490304 0 7
>>>     2 7 1181745152 1181745152 1160773632 0 4096 4096 4096 0x83
>>>     fs_Mounted|not_fs_Seeding|fs_Rotating
>>>
>>> Writable|MD|not_Missing|not_Discard|not_Replace_tgt|not_Run_pending|not_Nobarriers|Stat_valid|Stat_dirty|Bdev 
>>>
>>>
>>>
>>>
>>> (below btrfs-progs patch added intelligence to tell the world that
>>> device is missing)
>>>
>>> Ref:
>>> ~~~~~~~
>>> commit 2ae6a037efd52ae0fa30052d456ad07f074f5d54
>>> Author: Qu Wenruo <quwenruo@cn.fujitsu.com>
>>> Date:   Fri Feb 7 15:07:19 2014 +0800
>>>
>>>     btrfs-progs: Add missing devices check for mounted btrfs.
>>> ~~~~~~~
>>>
>>> $ btrfs fi show
>>> Label: none  uuid: dfbf136d-e8d2-489b-8ee1-be0d5999769d
>>>     Total devices 2 FS bytes used 663.81MiB
>>>     devid    2 size 1.10GiB used 1.08GiB path /dev/sdc
>>>     *** Some devices missing
>>>
>>> $ btrfs dev add /dev/sde /btrfs
>>> $ btrfs fi show
>>> Label: none  uuid: dfbf136d-e8d2-489b-8ee1-be0d5999769d
>>>     Total devices 3 FS bytes used 663.81MiB
>>>     devid    2 size 1.10GiB used 1.08GiB path /dev/sdc
>>>     devid    3 size 1.04GiB used 0.00 path /dev/sde
>>>     *** Some devices missing
>>>
>>>
>>> Now the bug is delete missing fails. Since kernel don't
>>> understand whats missing.
>>>
>>> $ btrfs dev del missing /btrfs
>>> ERROR: error removing the device 'missing' - no missing devices found
>>> to remove
>>> $
>>> --------------------------------------------------------------------------- 
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 02/10/2014 08:36 AM, Qu Wenruo wrote:
>>>> On Fri, 07 Feb 2014 17:34:46 +0800, Anand Jain wrote:
>>>>>
>>>>>
>>>>>  IMO btrfs-progs shouldn't add its intelligence to know if disk
>>>>>  is missing. If btrfs-kernel doesn't know when disk is missing
>>>>>  that's a bug to fix in btrfs-kernel. yes that indeed true as
>>>>>  of now in btrfs-kernel. btrfs kernel has no idea when disk
>>>>>  goes missing, just -EIO doesn't tell btrfs that. I am trying
>>>>>  to fix this first.
>>>>>
>>>>>  But the problem is there isn't good way with in btrfs/FS
>>>>>  to know when disk goes missing. did I miss anything ?
>>>> Yes, kernel detection is the best way.
>>>> But since it has no better way to detect missing device, I think the
>>>> btrfs-progs way fix is good enough for now.
>>>>
>>>> Since btrfs fi show with "-d" options will scan the /dev to find fs 
>>>> and
>>>> check missing disks,
>>>> I think adds some user-land check even using the ioctl way is still
>>>> somewhat reasonable.
>>>>
>>>> Thanks
>>>> Qu
>>>>>
>>>>>
>>>>> Thanks, Anand
>>>>>
>>>>>
>>>>> On 02/07/2014 02:45 PM, Qu Wenruo wrote:
>>>>>> In btrfs/003 of xfstest, it will check whether btrfs fi show can 
>>>>>> find
>>>>>> missing devices.
>>>>>>
>>>>>> But before the patch, btrfs-progs will not check whether device
>>>>>> missing
>>>>>> if given a mounted btrfs mountpoint/block device.
>>>>>> This patch fixes the bug and will pass btrfs/003.
>>>>>>
>>>>>> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
>>>>>> Cc: Anand Jain <anand.jain@oracle.com>
>>>>>> ---
>>>>>>   cmds-filesystem.c | 12 ++++++++++++
>>>>>>   1 file changed, 12 insertions(+)
>>>>>>
>>>>>> diff --git a/cmds-filesystem.c b/cmds-filesystem.c
>>>>>> index 384d1b9..4c9933d 100644
>>>>>> --- a/cmds-filesystem.c
>>>>>> +++ b/cmds-filesystem.c
>>>>>> @@ -363,6 +363,8 @@ static int print_one_fs(struct
>>>>>> btrfs_ioctl_fs_info_args *fs_info,
>>>>>>           char *label, char *path)
>>>>>>   {
>>>>>>       int i;
>>>>>> +    int fd;
>>>>>> +    int missing;
>>>>>>       char uuidbuf[BTRFS_UUID_UNPARSED_SIZE];
>>>>>>       struct btrfs_ioctl_dev_info_args *tmp_dev_info;
>>>>>>       int ret;
>>>>>> @@ -385,6 +387,14 @@ static int print_one_fs(struct
>>>>>> btrfs_ioctl_fs_info_args *fs_info,
>>>>>>
>>>>>>       for (i = 0; i < fs_info->num_devices; i++) {
>>>>>>           tmp_dev_info = (struct btrfs_ioctl_dev_info_args
>>>>>> *)&dev_info[i];
>>>>>> +
>>>>>> +        /* Add check for missing devices even mounted */
>>>>>> +        fd = open((char *)tmp_dev_info->path, O_RDONLY);
>>>>>> +        if (fd < 0) {
>>>>>> +            missing = 1;
>>>>>> +            continue;
>>>>>> +        }
>>>>>> +        close(fd);
>>>>>>           printf("\tdevid %4llu size %s used %s path %s\n",
>>>>>>               tmp_dev_info->devid,
>>>>>>               pretty_size(tmp_dev_info->total_bytes),
>>>>>> @@ -392,6 +402,8 @@ static int print_one_fs(struct
>>>>>> btrfs_ioctl_fs_info_args *fs_info,
>>>>>>               tmp_dev_info->path);
>>>>>>       }
>>>>>>
>>>>>> +    if (missing)
>>>>>> +        printf("\t*** Some devices missing\n");
>>>>>>       printf("\n");
>>>>>>       return 0;
>>>>>>   }
>>>>>>
>>>>>
>>>>
>>>> -- 
>>>> To unsubscribe from this list: send the line "unsubscribe
>>>> linux-btrfs" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>> -- 
>>> To unsubscribe from this list: send the line "unsubscribe 
>>> linux-btrfs" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>> -- 
>> To unsubscribe from this list: send the line "unsubscribe 
>> linux-btrfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Anand Jain April 9, 2014, 9:12 a.m. UTC | #7
> As mentioned at the beginning, I prefer to remove the device remove test
> in testcase, then the patch can be reverted without any complain from me.

  Thats a wrong approach as well.
  The test case 003 is a very real case at the data centers.
  Especially the iscsi/san luns go offline/online all the time.


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/cmds-filesystem.c b/cmds-filesystem.c
index 384d1b9..4c9933d 100644
--- a/cmds-filesystem.c
+++ b/cmds-filesystem.c
@@ -363,6 +363,8 @@  static int print_one_fs(struct btrfs_ioctl_fs_info_args *fs_info,
 		char *label, char *path)
 {
 	int i;
+	int fd;
+	int missing;
 	char uuidbuf[BTRFS_UUID_UNPARSED_SIZE];
 	struct btrfs_ioctl_dev_info_args *tmp_dev_info;
 	int ret;
@@ -385,6 +387,14 @@  static int print_one_fs(struct btrfs_ioctl_fs_info_args *fs_info,
 
 	for (i = 0; i < fs_info->num_devices; i++) {
 		tmp_dev_info = (struct btrfs_ioctl_dev_info_args *)&dev_info[i];
+
+		/* Add check for missing devices even mounted */
+		fd = open((char *)tmp_dev_info->path, O_RDONLY);
+		if (fd < 0) {
+			missing = 1;
+			continue;
+		}
+		close(fd);
 		printf("\tdevid %4llu size %s used %s path %s\n",
 			tmp_dev_info->devid,
 			pretty_size(tmp_dev_info->total_bytes),
@@ -392,6 +402,8 @@  static int print_one_fs(struct btrfs_ioctl_fs_info_args *fs_info,
 			tmp_dev_info->path);
 	}
 
+	if (missing)
+		printf("\t*** Some devices missing\n");
 	printf("\n");
 	return 0;
 }