btrfs-progs: warn users about the possible dangers of check --repair
diff mbox series

Message ID 20191016140533.10583-1-jthumshirn@suse.de
State New
Headers show
Series
  • btrfs-progs: warn users about the possible dangers of check --repair
Related show

Commit Message

Johannes Thumshirn Oct. 16, 2019, 2:05 p.m. UTC
The manual page of btrfsck clearly states 'btrfs check --repair' is a
dangerous operation.

Although this warning is in place users do not read the manual page and/or
are used to the behaviour of fsck utilities which repair the filesystem,
and thus potentially cause harm.

Similar to 'btrfs balance' without any filters, add a warning and a
countdown, so users can bail out before eventual corrupting the filesystem
more than it already is.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 check/main.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

Comments

Qu Wenruo Oct. 16, 2019, 2:18 p.m. UTC | #1
On 2019/10/16 下午10:05, Johannes Thumshirn wrote:
> The manual page of btrfsck clearly states 'btrfs check --repair' is a
> dangerous operation.
> 
> Although this warning is in place users do not read the manual page and/or
> are used to the behaviour of fsck utilities which repair the filesystem,
> and thus potentially cause harm.
> 
> Similar to 'btrfs balance' without any filters, add a warning and a
> countdown, so users can bail out before eventual corrupting the filesystem
> more than it already is.
> 
> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  check/main.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/check/main.c b/check/main.c
> index fd05430c1f51..acded927281a 100644
> --- a/check/main.c
> +++ b/check/main.c
> @@ -9970,6 +9970,23 @@ static int cmd_check(const struct cmd_struct *cmd, int argc, char **argv)
>  		exit(1);
>  	}
>  
> +	if (repair) {
> +		int delay = 10;

Any delay would make the selftest miserably slow.

And in fact, recent btrfs check --repair is no longer that dangerous.
Sure, it still can't handle everything yet, but at least it's not making
things (that) worse.

Deadly bugs like the lack of flush/fua is already solved, so I'm not
100% sure if we still need such a big warning.

Thanks,
Qu

> +		printf("WARNING:\n\n");
> +		printf("\tDo not use --repair unless you are advised to do so by a developer\n");
> +		printf("\tor an experienced user, and then only after having accepted that no\n");
> +		printf("\tfsck successfully repair all types of filesystem corruption. Eg.\n");
> +		printf("\tsome other software or hardware bugs can fatally damage a volume.\n");
> +		printf("\tThe operation will start in %d seconds.\n", delay);
> +		printf("\tUse Ctrl-C to stop it.\n");
> +		while (delay) {
> +			printf("%2d", delay--);
> +			fflush(stdout);
> +			sleep(1);
> +		}
> +		printf("\nStarting repair.\n");
> +	}
> +
>  	/*
>  	 * experimental and dangerous
>  	 */
>
Nikolay Borisov Oct. 16, 2019, 2:31 p.m. UTC | #2
On 16.10.19 г. 17:05 ч., Johannes Thumshirn wrote:
> The manual page of btrfsck clearly states 'btrfs check --repair' is a
> dangerous operation.
> 
> Although this warning is in place users do not read the manual page and/or
> are used to the behaviour of fsck utilities which repair the filesystem,
> and thus potentially cause harm.
> 
> Similar to 'btrfs balance' without any filters, add a warning and a
> countdown, so users can bail out before eventual corrupting the filesystem
> more than it already is.
> 
> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  check/main.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/check/main.c b/check/main.c
> index fd05430c1f51..acded927281a 100644
> --- a/check/main.c
> +++ b/check/main.c
> @@ -9970,6 +9970,23 @@ static int cmd_check(const struct cmd_struct *cmd, int argc, char **argv)
>  		exit(1);
>  	}
>  
> +	if (repair) {
> +		int delay = 10;
> +		printf("WARNING:\n\n");
> +		printf("\tDo not use --repair unless you are advised to do so by a developer\n");
> +		printf("\tor an experienced user, and then only after having accepted that no\n");
> +		printf("\tfsck successfully repair all types of filesystem corruption. Eg.\n");
> +		printf("\tsome other software or hardware bugs can fatally damage a volume.\n");

nit: The word 'other' here is redundant, no ?

> +		printf("\tThe operation will start in %d seconds.\n", delay);
> +		printf("\tUse Ctrl-C to stop it.\n");
> +		while (delay) {
> +			printf("%2d", delay--);
> +			fflush(stdout);
> +			sleep(1);
> +		}

That's a long winded way to have a simple for  loop that prints 10 dots,
1 second apart.  IMO a better use experience would be to ask the user to
confirm and if the '-f' options i passed don't bother printing the
warning at all.

> +		printf("\nStarting repair.\n");
> +	}
> +
>  	/*
>  	 * experimental and dangerous
>  	 */
>
Anand Jain Oct. 17, 2019, 1:25 a.m. UTC | #3
On 10/16/19 10:31 PM, Nikolay Borisov wrote:
> 
> 
> On 16.10.19 г. 17:05 ч., Johannes Thumshirn wrote:
>> The manual page of btrfsck clearly states 'btrfs check --repair' is a
>> dangerous operation.
>>
>> Although this warning is in place users do not read the manual page and/or
>> are used to the behaviour of fsck utilities which repair the filesystem,
>> and thus potentially cause harm.
>>
>> Similar to 'btrfs balance' without any filters, add a warning and a
>> countdown, so users can bail out before eventual corrupting the filesystem
>> more than it already is.
>>
>> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
>> ---
>>   check/main.c | 17 +++++++++++++++++
>>   1 file changed, 17 insertions(+)
>>
>> diff --git a/check/main.c b/check/main.c
>> index fd05430c1f51..acded927281a 100644
>> --- a/check/main.c
>> +++ b/check/main.c
>> @@ -9970,6 +9970,23 @@ static int cmd_check(const struct cmd_struct *cmd, int argc, char **argv)
>>   		exit(1);
>>   	}
>>   
>> +	if (repair) {
>> +		int delay = 10;
>> +		printf("WARNING:\n\n");
>> +		printf("\tDo not use --repair unless you are advised to do so by a developer\n");
>> +		printf("\tor an experienced user, and then only after having accepted that no\n");
>> +		printf("\tfsck successfully repair all types of filesystem corruption. Eg.\n");
>> +		printf("\tsome other software or hardware bugs can fatally damage a volume.\n");
> 
> nit: The word 'other' here is redundant, no ?
> 
>> +		printf("\tThe operation will start in %d seconds.\n", delay);
>> +		printf("\tUse Ctrl-C to stop it.\n");
>> +		while (delay) {
>> +			printf("%2d", delay--);
>> +			fflush(stdout);
>> +			sleep(1);
>> +		}
> 
> That's a long winded way to have a simple for  loop that prints 10 dots,
> 1 second apart.


>  IMO a better use experience would be to ask the user to
> confirm and if the '-f' options i passed don't bother printing the
> warning at all.

  Agreed. -f will suffice (at least make it non-default) is a good fix.
  But again as Qu pointed out our test cases will fail or old test case
  with new progs will fail.

Thanks, Anand

>> +		printf("\nStarting repair.\n");
>> +	}
>> +
>>   	/*
>>   	 * experimental and dangerous
>>   	 */
>>
Nikolay Borisov Oct. 17, 2019, 6:40 a.m. UTC | #4
On 17.10.19 г. 4:25 ч., Anand Jain wrote:
> On 10/16/19 10:31 PM, Nikolay Borisov wrote:
>>
>>
>> On 16.10.19 г. 17:05 ч., Johannes Thumshirn wrote:
>>> The manual page of btrfsck clearly states 'btrfs check --repair' is a
>>> dangerous operation.
>>>
>>> Although this warning is in place users do not read the manual page
>>> and/or
>>> are used to the behaviour of fsck utilities which repair the filesystem,
>>> and thus potentially cause harm.
>>>
>>> Similar to 'btrfs balance' without any filters, add a warning and a
>>> countdown, so users can bail out before eventual corrupting the
>>> filesystem
>>> more than it already is.
>>>
>>> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
>>> ---
>>>   check/main.c | 17 +++++++++++++++++
>>>   1 file changed, 17 insertions(+)
>>>
>>> diff --git a/check/main.c b/check/main.c
>>> index fd05430c1f51..acded927281a 100644
>>> --- a/check/main.c
>>> +++ b/check/main.c
>>> @@ -9970,6 +9970,23 @@ static int cmd_check(const struct cmd_struct
>>> *cmd, int argc, char **argv)
>>>           exit(1);
>>>       }
>>>   +    if (repair) {
>>> +        int delay = 10;
>>> +        printf("WARNING:\n\n");
>>> +        printf("\tDo not use --repair unless you are advised to do
>>> so by a developer\n");
>>> +        printf("\tor an experienced user, and then only after having
>>> accepted that no\n");
>>> +        printf("\tfsck successfully repair all types of filesystem
>>> corruption. Eg.\n");
>>> +        printf("\tsome other software or hardware bugs can fatally
>>> damage a volume.\n");
>>
>> nit: The word 'other' here is redundant, no ?
>>
>>> +        printf("\tThe operation will start in %d seconds.\n", delay);
>>> +        printf("\tUse Ctrl-C to stop it.\n");
>>> +        while (delay) {
>>> +            printf("%2d", delay--);
>>> +            fflush(stdout);
>>> +            sleep(1);
>>> +        }
>>
>> That's a long winded way to have a simple for  loop that prints 10 dots,
>> 1 second apart.
> 
> 
>>  IMO a better use experience would be to ask the user to
>> confirm and if the '-f' options i passed don't bother printing the
>> warning at all.
> 
>  Agreed. -f will suffice (at least make it non-default) is a good fix.
>  But again as Qu pointed out our test cases will fail or old test case
>  with new progs will fail.

They could be adjusted accordingly to always append the -f flag when
running --repair. After all when running tests we do expect to be able
to fix everything, no ?

> 
> Thanks, Anand
> 
>>> +        printf("\nStarting repair.\n");
>>> +    }
>>> +
>>>       /*
>>>        * experimental and dangerous
>>>        */
>>>
> 
>
Johannes Thumshirn Oct. 17, 2019, 7:10 a.m. UTC | #5
On 17/10/2019 08:40, Nikolay Borisov wrote:
[...]
>>  Agreed. -f will suffice (at least make it non-default) is a good fix.
>>  But again as Qu pointed out our test cases will fail or old test case
>>  with new progs will fail.
> 
> They could be adjusted accordingly to always append the -f flag when
> running --repair. After all when running tests we do expect to be able
> to fix everything, no ?

Agreed
Johannes Thumshirn Oct. 17, 2019, 7:13 a.m. UTC | #6
On 16/10/2019 16:31, Nikolay Borisov wrote:
[...]
>> +		printf("\tfsck successfully repair all types of filesystem corruption. Eg.\n");
>> +		printf("\tsome other software or hardware bugs can fatally damage a volume.\n");
> 
> nit: The word 'other' here is redundant, no ?

Hmm really, maybe. But the sentence above lacks a 'can' (I'll fix it up
in the manpage as well.

> 
>> +		printf("\tThe operation will start in %d seconds.\n", delay);
>> +		printf("\tUse Ctrl-C to stop it.\n");
>> +		while (delay) {
>> +			printf("%2d", delay--);
>> +			fflush(stdout);
>> +			sleep(1);
>> +		}
> 
> That's a long winded way to have a simple for  loop that prints 10 dots,
> 1 second apart.  IMO a better use experience would be to ask the user to
> confirm and if the '-f' options i passed don't bother printing the
> warning at all.

That's just copy & paste from cmds/balance.c

Patch
diff mbox series

diff --git a/check/main.c b/check/main.c
index fd05430c1f51..acded927281a 100644
--- a/check/main.c
+++ b/check/main.c
@@ -9970,6 +9970,23 @@  static int cmd_check(const struct cmd_struct *cmd, int argc, char **argv)
 		exit(1);
 	}
 
+	if (repair) {
+		int delay = 10;
+		printf("WARNING:\n\n");
+		printf("\tDo not use --repair unless you are advised to do so by a developer\n");
+		printf("\tor an experienced user, and then only after having accepted that no\n");
+		printf("\tfsck successfully repair all types of filesystem corruption. Eg.\n");
+		printf("\tsome other software or hardware bugs can fatally damage a volume.\n");
+		printf("\tThe operation will start in %d seconds.\n", delay);
+		printf("\tUse Ctrl-C to stop it.\n");
+		while (delay) {
+			printf("%2d", delay--);
+			fflush(stdout);
+			sleep(1);
+		}
+		printf("\nStarting repair.\n");
+	}
+
 	/*
 	 * experimental and dangerous
 	 */