[v2,3/3] btrfs-progs: device: add remove missing-all
diff mbox

Message ID be3e6086-5a2f-77d0-5dbc-dcb5a1a57815@jp.fujitsu.com
State New
Headers show

Commit Message

Misono Tomohiro Oct. 11, 2017, 2:18 a.m. UTC
Add 'btrfs remove missing-all' to remove all the missing devices
at once for improving usability.

Example:
 sudo mkfs.btrfs -f -d raid1 /dev/sdb1 /dev/sdb2 /dev/sdb3 /dev/sdb4
 sudo wipefs -a /dev/sdb1 /dev/sdb3
 sudo mount -o degraded /dev/sdb2 /mnt
 sudo btrfs filesystem show /mnt
 sudo btrfs device remove missing-all /mnt
 sudo btrfs filesystem show /mnt

Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
---
 Documentation/btrfs-device.asciidoc |  1 +
 cmds-device.c                       | 24 +++++++++++++++++++++++-
 2 files changed, 24 insertions(+), 1 deletion(-)

Comments

Duncan Oct. 13, 2017, 5:27 a.m. UTC | #1
Misono, Tomohiro posted on Wed, 11 Oct 2017 11:18:50 +0900 as excerpted:

> Add 'btrfs remove missing-all' to remove all the missing devices at once
> for improving usability.
> 
> Example:
>  sudo mkfs.btrfs -f -d raid1 /dev/sdb1 /dev/sdb2 /dev/sdb3 /dev/sdb4
>  sudo wipefs -a /dev/sdb1 /dev/sdb3
>  sudo mount -o degraded /dev/sdb2 /mnt
>  sudo btrfs filesystem show /mnt
>  sudo btrfs device remove missing-all /mnt
>  sudo btrfs filesystem show /mnt


There's a reason remove missing-all hasn't yet been implemented.

Note that the above would be very unlikely to work once a filesystem has 
been used in any significant way, because raid1 and raid10 are explicitly 
chunk pairs, *NOT* duplicated N times across N devices.  So with two 
devices missing, chances are that both copies of some chunks will be 
missing as well, so the filesystem would no longer be mountable degraded-
writable, only degraded-readonly, in which case device remove won't work 
at all because the filesystem is readonly.

In fact, until the recent per-chunk check patches went in, it was 
impossible to mount-writable a raid1 missing two devices at all, because 
the safeguards simply assumed some chunks would be entirely missing.

The only case in which more than a single device missing is likely to be 
mountable degraded-writable (so device remove will work at all) is raid6, 
tho with recent patches there's narrow cases in which it /might/ be 
doable with raid1 as well.

Now you may still wish to implement remove missing-all for raid6 mode and 
for the unusual corner-case raid1/raid10 in which it might work, but the 
documentation should be pretty clear that save for raid6 it can't be 
expected to work in most cases.

Given that, I think remove missing-all hasn't been implemented as it 
simply hasn't been considered to be worth the bother for the narrow use-
cases in which it will actually work.
Anand Jain Oct. 16, 2017, 3:30 a.m. UTC | #2
On 10/13/2017 01:27 PM, Duncan wrote:
> Misono, Tomohiro posted on Wed, 11 Oct 2017 11:18:50 +0900 as excerpted:
> 
>> Add 'btrfs remove missing-all' to remove all the missing devices at once
>> for improving usability.
>>
>> Example:
>>   sudo mkfs.btrfs -f -d raid1 /dev/sdb1 /dev/sdb2 /dev/sdb3 /dev/sdb4
>>   sudo wipefs -a /dev/sdb1 /dev/sdb3
>>   sudo mount -o degraded /dev/sdb2 /mnt <-- 


   I agree with Duncan here. This step itself will fail even with RO
   option. Do you have any patch that is not in the ML which will
   make this step a success in the first place ?

Thanks, Anand


>>   sudo btrfs filesystem show /mnt
>>   sudo btrfs device remove missing-all /mnt
>>   sudo btrfs filesystem show /mnt
> 
> 
> There's a reason remove missing-all hasn't yet been implemented.
> 
> Note that the above would be very unlikely to work once a filesystem has
> been used in any significant way, because raid1 and raid10 are explicitly
> chunk pairs, *NOT* duplicated N times across N devices.  So with two
> devices missing, chances are that both copies of some chunks will be
> missing as well, so the filesystem would no longer be mountable degraded-
> writable, only degraded-readonly, in which case device remove won't work
> at all because the filesystem is readonly.
> 
> In fact, until the recent per-chunk check patches went in, it was
> impossible to mount-writable a raid1 missing two devices at all, because
> the safeguards simply assumed some chunks would be entirely missing.
> 
> The only case in which more than a single device missing is likely to be
> mountable degraded-writable (so device remove will work at all) is raid6,
> tho with recent patches there's narrow cases in which it /might/ be
> doable with raid1 as well.
> 
> Now you may still wish to implement remove missing-all for raid6 mode and
> for the unusual corner-case raid1/raid10 in which it might work, but the
> documentation should be pretty clear that save for raid6 it can't be
> expected to work in most cases.
> 
> Given that, I think remove missing-all hasn't been implemented as it
> simply hasn't been considered to be worth the bother for the narrow use-
> cases in which it will actually work.
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Misono Tomohiro Oct. 16, 2017, 4:32 a.m. UTC | #3
On 2017/10/13 14:27, Duncan wrote:
> Misono, Tomohiro posted on Wed, 11 Oct 2017 11:18:50 +0900 as excerpted:
> 
>> Add 'btrfs remove missing-all' to remove all the missing devices at once
>> for improving usability.
>>
>> Example:
>>  sudo mkfs.btrfs -f -d raid1 /dev/sdb1 /dev/sdb2 /dev/sdb3 /dev/sdb4
>>  sudo wipefs -a /dev/sdb1 /dev/sdb3
>>  sudo mount -o degraded /dev/sdb2 /mnt
>>  sudo btrfs filesystem show /mnt
>>  sudo btrfs device remove missing-all /mnt
>>  sudo btrfs filesystem show /mnt
> 
> 
> There's a reason remove missing-all hasn't yet been implemented.
> 
> Note that the above would be very unlikely to work once a filesystem has 
> been used in any significant way, because raid1 and raid10 are explicitly 
> chunk pairs, *NOT* duplicated N times across N devices.  So with two 
> devices missing, chances are that both copies of some chunks will be 
> missing as well, so the filesystem would no longer be mountable degraded-
> writable, only degraded-readonly, in which case device remove won't work 
> at all because the filesystem is readonly.
> 
> In fact, until the recent per-chunk check patches went in, it was 
> impossible to mount-writable a raid1 missing two devices at all, because 
> the safeguards simply assumed some chunks would be entirely missing.
> 
> The only case in which more than a single device missing is likely to be 
> mountable degraded-writable (so device remove will work at all) is raid6, 
> tho with recent patches there's narrow cases in which it /might/ be 
> doable with raid1 as well.
> 
> Now you may still wish to implement remove missing-all for raid6 mode and 
> for the unusual corner-case raid1/raid10 in which it might work, but the 
> documentation should be pretty clear that save for raid6 it can't be 
> expected to work in most cases.
> 
> Given that, I think remove missing-all hasn't been implemented as it 
> simply hasn't been considered to be worth the bother for the narrow use-
> cases in which it will actually work.

Thanks for the comments.

I thought this is useful, but agree that this is for rare case and might be
confusing. So, I will drop 3rd patch and just resend 1st/2nd again.

Regards,
Tomohiro

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Misono Tomohiro Oct. 16, 2017, 4:35 a.m. UTC | #4
On 2017/10/16 12:30, Anand Jain wrote:
> 
> 
> On 10/13/2017 01:27 PM, Duncan wrote:
>> Misono, Tomohiro posted on Wed, 11 Oct 2017 11:18:50 +0900 as excerpted:
>>
>>> Add 'btrfs remove missing-all' to remove all the missing devices at once
>>> for improving usability.
>>>
>>> Example:
>>>   sudo mkfs.btrfs -f -d raid1 /dev/sdb1 /dev/sdb2 /dev/sdb3 /dev/sdb4
>>>   sudo wipefs -a /dev/sdb1 /dev/sdb3
>>>   sudo mount -o degraded /dev/sdb2 /mnt <-- 
> 
> 
>    I agree with Duncan here. This step itself will fail even with RO
>    option. Do you have any patch that is not in the ML which will
>    make this step a success in the first place ?
> 
> Thanks, Anand
> 

commit 21634a19f646 ("btrfs: Introduce a function to check if all chunks a OK
for degraded rw mount") allow this from 4.14 (I checked on 4.14-rc4).
But I will withdraw this patch as Duncan suggests.

Thanks,
Tomohiro

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Anand Jain Oct. 16, 2017, 6:46 a.m. UTC | #5
On 10/16/2017 12:35 PM, Misono, Tomohiro wrote:
> 
> 
> On 2017/10/16 12:30, Anand Jain wrote:
>>
>>
>> On 10/13/2017 01:27 PM, Duncan wrote:
>>> Misono, Tomohiro posted on Wed, 11 Oct 2017 11:18:50 +0900 as excerpted:
>>>
>>>> Add 'btrfs remove missing-all' to remove all the missing devices at once
>>>> for improving usability.
>>>>
>>>> Example:
>>>>    sudo mkfs.btrfs -f -d raid1 /dev/sdb1 /dev/sdb2 /dev/sdb3 /dev/sdb4
>>>>    sudo wipefs -a /dev/sdb1 /dev/sdb3
>>>>    sudo mount -o degraded /dev/sdb2 /mnt <--
>>
>>
>>     I agree with Duncan here. This step itself will fail even with RO
>>     option.

  Correction. RO mount is possible based on which disk you choose to mount.

Thanks - Anand


>> Do you have any patch that is not in the ML which will
>>     make this step a success in the first place ?
>>
>> Thanks, Anand
>>
> 
> commit 21634a19f646 ("btrfs: Introduce a function to check if all chunks a OK
> for degraded rw mount") allow this from 4.14 (I checked on 4.14-rc4).
> But I will withdraw this patch as Duncan suggests.
> 
> Thanks,
> Tomohiro
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Duncan Oct. 16, 2017, 2:16 p.m. UTC | #6
Misono, Tomohiro posted on Mon, 16 Oct 2017 13:35:08 +0900 as excerpted:

> On 2017/10/16 12:30, Anand Jain wrote:
>> 
>> 
>> On 10/13/2017 01:27 PM, Duncan wrote:
>>> Misono, Tomohiro posted on Wed, 11 Oct 2017 11:18:50 +0900 as
>>> excerpted:
>>>
>>>> Add 'btrfs remove missing-all' to remove all the missing devices at
>>>> once for improving usability.
>>>>
>>>> Example:
>>>>   sudo mkfs.btrfs -f -d raid1 /dev/sdb1 /dev/sdb2 /dev/sdb3 /dev/sdb4
>>>>   sudo wipefs -a /dev/sdb1 /dev/sdb3 sudo mount -o degraded /dev/sdb2
>>>>   /mnt <--
>> 
>> 
>>    I agree with Duncan here. This step itself will fail even with RO
>>    option. Do you have any patch that is not in the ML which will make
>>    this step a success in the first place ?
>> 
>> Thanks, Anand
>> 
>> 
> commit 21634a19f646 ("btrfs: Introduce a function to check if all chunks
> a OK for degraded rw mount") allow this from 4.14 (I checked on
> 4.14-rc4).

That's why I said recent patches allow it in corner-cases.  However, I 
think those corner-cases would I think be difficult to document concisely 
in the manpage, and without that, I believe the option would be more 
confusing than helpful, since people would expect it to actually work 
when it won't, for them.

> But I will withdraw this patch as Duncan suggests.

If someone comes up with a satisfactory way to explain at a manpage level 
the corner-cases in which removing more than one missing device at a time 
can work...

Patch
diff mbox

diff --git a/Documentation/btrfs-device.asciidoc b/Documentation/btrfs-device.asciidoc
index dd60415..f08d64d 100644
--- a/Documentation/btrfs-device.asciidoc
+++ b/Documentation/btrfs-device.asciidoc
@@ -79,6 +79,7 @@  lowest device id.
 If device is mounted as degraded mode (-o degraded), special term "missing"
 can be used for <device>. In that case, the first device that is described by
 the filesystem metadata, but not preseted at the mount time will be removed.
+Also, "missing-all" can be used to remove all the missing devices.
 
 *delete* <device>|<devid> [<device>|<devid>...] <path>::
 Alias of remove kept for backward compatibility
diff --git a/cmds-device.c b/cmds-device.c
index d28ed0f..507ad04 100644
--- a/cmds-device.c
+++ b/cmds-device.c
@@ -164,6 +164,8 @@  static int _cmd_device_remove(int argc, char **argv,
 		struct	btrfs_ioctl_vol_args arg;
 		struct btrfs_ioctl_vol_args_v2 argv2 = {0};
 		int is_devid = 0;
+		int is_missing_all = 0;
+		int num_missing = 0;
 		int	res;
 
 		if (string_is_numerical(argv[i])) {
@@ -173,12 +175,16 @@  static int _cmd_device_remove(int argc, char **argv,
 		} else if (is_block_device(argv[i]) == 1 ||
 				strcmp(argv[i], "missing") == 0) {
 			strncpy_null(argv2.name, argv[i]);
+		} else if (strcmp(argv[i], "missing-all") == 0) {
+			is_missing_all = 1;
+			strncpy_null(argv2.name, "missing");
 		} else {
 			error("not a block device: %s", argv[i]);
 			ret++;
 			continue;
 		}
 
+again:
 		/*
 		 * Positive values are from BTRFS_ERROR_DEV_*,
 		 * otherwise it's a generic error, one of errnos
@@ -202,6 +208,21 @@  static int _cmd_device_remove(int argc, char **argv,
 			res = ioctl(fdmnt, BTRFS_IOC_RM_DEV, &arg);
 		}
 
+		if (is_missing_all) {
+			if (!res) {
+				num_missing++;
+				goto again;
+			}
+
+			if (num_missing > 0 &&
+				(res == BTRFS_ERROR_DEV_RAID1_MIN_NOT_MET ||
+				 res == BTRFS_ERROR_DEV_RAID10_MIN_NOT_MET ||
+				 res == BTRFS_ERROR_DEV_RAID5_MIN_NOT_MET ||
+				 res == BTRFS_ERROR_DEV_RAID6_MIN_NOT_MET ||
+				 res == BTRFS_ERROR_DEV_MISSING_NOT_FOUND))
+				continue;
+		}
+
 		if (res) {
 			const char *msg;
 
@@ -228,7 +249,8 @@  static int _cmd_device_remove(int argc, char **argv,
 	"", \
 	"If 'missing' is specified for <device>, the first device that is", \
 	"described by the filesystem metadata, but not presented at the", \
-	"mount time will be removed."
+	"mount time will be removed. Use 'missing-all' to remove all the", \
+	"missing devices."
 
 static const char * const cmd_device_remove_usage[] = {
 	"btrfs device remove <device>|<devid> [<device>|<devid>...] <path>",