
[v2] xfs_repair: update the manual content about xfs_repair exit status

Message ID 1473782076-9137-1-git-send-email-zlang@redhat.com (mailing list archive)
State Superseded, archived

Commit Message

Zorro Lang Sept. 13, 2016, 3:54 p.m. UTC
The xfs_repair(8) man page says "xfs_repair run without the -n option will
always return a status code of 0". That is not correct.

xfs_repair returns 2 if it finds a filesystem log which needs to be
replayed or cleared, 1 if a runtime error is encountered, and 0 in
all other cases.

Signed-off-by: Zorro Lang <zlang@redhat.com>
---

Hi,

V2 makes the following changes:
 - change the description for xfs_repair
 - remove the description for "xfs_repair -L"

Thanks,
Zorro

 man/man8/xfs_repair.8 | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Eric Sandeen Sept. 13, 2016, 4:17 p.m. UTC | #1
On 9/13/16 10:54 AM, Zorro Lang wrote:
> The man 8 xfs_repair said "xfs_repair run without the -n option will
> always return a status code of 0". That's not correct.
> 
> xfs_repair will return 2 if it finds a fs log which needs to be
> replayed or cleared, 1 if runtime error is encountered, and 0 for
> all other cases.
> 
> Signed-off-by: Zorro Lang <zlang@redhat.com>
> ---
> 
> Hi,
> 
> V2 patch did below things:
>  - change the description for xfs_repair
>  - remove the description for "xfs_repair -L"
> 
> Thanks,
> Zorro
> 
>  man/man8/xfs_repair.8 | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/man/man8/xfs_repair.8 b/man/man8/xfs_repair.8
> index 1b4d9e3..e45fd90 100644
> --- a/man/man8/xfs_repair.8
> +++ b/man/man8/xfs_repair.8
> @@ -504,12 +504,17 @@ that is known to be free. The entry is therefore invalid and is deleted.
>  This message refers to a large directory.
>  If the directory were small, the message would read "junking entry ...".
>  .SH EXIT STATUS
> +.TP
>  .B xfs_repair \-n
>  (no modify node)

s/node/mode/, maybe Dave can fix that on commit.  Sorry I missed it the first time.

>  will return a status of 1 if filesystem corruption was detected and
>  0 if no filesystem corruption was detected.
> +.TP
>  .B xfs_repair
> -run without the \-n option will always return a status code of 0.
> +run without the \-n option will return a status code of 2 if it finds a
> +filesystem log which needs to be replayed(by a mount/umount cycle) or

space after replayed

> +cleared(by -L option), 1 if a runtime error is encountered, and 0 in all

space after cleared

> +other cases, whether or not filesystem corruption was detected.

Yep, I think this is ok with the small fixes above.  I hope the "whether or
not" is not more confusing, but I think it probably clarifies.

Dave, if you don't mind the small fixups on the way in,

Reviewed-by: Eric Sandeen <sandeen@redhat.com>

Thanks,
-Eric


>  .SH BUGS
>  The filesystem to be checked and repaired must have been
>  unmounted cleanly using normal system administration procedures
>

Darrick J. Wong Sept. 13, 2016, 4:32 p.m. UTC | #2
On Tue, Sep 13, 2016 at 11:54:36PM +0800, Zorro Lang wrote:
> The man 8 xfs_repair said "xfs_repair run without the -n option will
> always return a status code of 0". That's not correct.
> 
> xfs_repair will return 2 if it finds a fs log which needs to be
> replayed or cleared, 1 if runtime error is encountered, and 0 for
> all other cases.
> 
> Signed-off-by: Zorro Lang <zlang@redhat.com>
> ---
> 
> Hi,
> 
> V2 patch did below things:
>  - change the description for xfs_repair
>  - remove the description for "xfs_repair -L"
> 
> Thanks,
> Zorro
> 
>  man/man8/xfs_repair.8 | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/man/man8/xfs_repair.8 b/man/man8/xfs_repair.8
> index 1b4d9e3..e45fd90 100644
> --- a/man/man8/xfs_repair.8
> +++ b/man/man8/xfs_repair.8
> @@ -504,12 +504,17 @@ that is known to be free. The entry is therefore invalid and is deleted.
>  This message refers to a large directory.
>  If the directory were small, the message would read "junking entry ...".
>  .SH EXIT STATUS
> +.TP
>  .B xfs_repair \-n
>  (no modify node)
>  will return a status of 1 if filesystem corruption was detected and
>  0 if no filesystem corruption was detected.
> +.TP
>  .B xfs_repair
> -run without the \-n option will always return a status code of 0.
> +run without the \-n option will return a status code of 2 if it finds a
> +filesystem log which needs to be replayed(by a mount/umount cycle) or
> +cleared(by -L option), 1 if a runtime error is encountered, and 0 in all
> +other cases, whether or not filesystem corruption was detected.

So... I'd rather the documentation about the return code reflect the
status of the filesystem -- 2 means "unclean log, replay it or zap it",
1 means "errors encountered, fs may not be correct", and 0 /should/ mean
"fs is correct".

OTOH I don't know for sure that xfs_repair always cleans up the fs on
the first try.  From my fuzzing experiments a few years ago this seems
to be the case nearly all the time (unlike e2fsck) but not 100%.  ISTR
asking Dave about this, and I think he said that the FS should be clean
if repair returns 0.  But I'll let him reiterate that if it's true;
don't trust my crummy memory, that's why I have filesystems. ;)

--D

>  .SH BUGS
>  The filesystem to be checked and repaired must have been
>  unmounted cleanly using normal system administration procedures
> -- 
> 2.7.4
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

Eric Sandeen Sept. 13, 2016, 4:57 p.m. UTC | #3
On 9/13/16 11:32 AM, Darrick J. Wong wrote:
> On Tue, Sep 13, 2016 at 11:54:36PM +0800, Zorro Lang wrote:
>> The man 8 xfs_repair said "xfs_repair run without the -n option will
>> always return a status code of 0". That's not correct.
>>
>> xfs_repair will return 2 if it finds a fs log which needs to be
>> replayed or cleared, 1 if runtime error is encountered, and 0 for
>> all other cases.
>>
>> Signed-off-by: Zorro Lang <zlang@redhat.com>
>> ---
>>
>> Hi,
>>
>> V2 patch did below things:
>>  - change the description for xfs_repair
>>  - remove the description for "xfs_repair -L"
>>
>> Thanks,
>> Zorro
>>
>>  man/man8/xfs_repair.8 | 7 ++++++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/man/man8/xfs_repair.8 b/man/man8/xfs_repair.8
>> index 1b4d9e3..e45fd90 100644
>> --- a/man/man8/xfs_repair.8
>> +++ b/man/man8/xfs_repair.8
>> @@ -504,12 +504,17 @@ that is known to be free. The entry is therefore invalid and is deleted.
>>  This message refers to a large directory.
>>  If the directory were small, the message would read "junking entry ...".
>>  .SH EXIT STATUS
>> +.TP
>>  .B xfs_repair \-n
>>  (no modify node)
>>  will return a status of 1 if filesystem corruption was detected and
>>  0 if no filesystem corruption was detected.
>> +.TP
>>  .B xfs_repair
>> -run without the \-n option will always return a status code of 0.
>> +run without the \-n option will return a status code of 2 if it finds a
>> +filesystem log which needs to be replayed(by a mount/umount cycle) or
>> +cleared(by -L option), 1 if a runtime error is encountered, and 0 in all
>> +other cases, whether or not filesystem corruption was detected.
> 
> So... I'd rather the documentation about the return code reflect the
> status of the filesystem -- 2 means "unclean log, replay it or zap it",
> 1 means "errors encountered, fs may not be correct", and 0 /should/ mean
> "fs is correct".
> 
> OTOH I don't know for sure that xfs_repair always cleans up the fs on
> the first try.

That's certainly the intent; I can't imagine a manpage documenting
return codes qualified with "... unless bugs happen." :)

>  From my fuzzing experiments a few years ago this seems
> to be the case nearly all the time (unlike e2fsck) but not 100%.

Same here, I fixed what I found...

>  ISTR
> asking Dave about this, and I think he said that the FS should be clean
> if repair returns 0.  But I'll let him reiterate that if it's true;
> don't trust my crummy memory, that's why I have filesystems. ;)

Did you have an alternate wording in mind?

-Eric

> --D
> 
>>  .SH BUGS
>>  The filesystem to be checked and repaired must have been
>>  unmounted cleanly using normal system administration procedures
>> -- 
>> 2.7.4
>>
>

Dave Chinner Sept. 13, 2016, 9:48 p.m. UTC | #4
On Tue, Sep 13, 2016 at 11:57:59AM -0500, Eric Sandeen wrote:
> On 9/13/16 11:32 AM, Darrick J. Wong wrote:
> > On Tue, Sep 13, 2016 at 11:54:36PM +0800, Zorro Lang wrote:
> >> The man 8 xfs_repair said "xfs_repair run without the -n option will
> >> always return a status code of 0". That's not correct.
> >>
> >> xfs_repair will return 2 if it finds a fs log which needs to be
> >> replayed or cleared, 1 if runtime error is encountered, and 0 for
> >> all other cases.
> >>
> >> Signed-off-by: Zorro Lang <zlang@redhat.com>
> >> ---
> >>
> >> Hi,
> >>
> >> V2 patch did below things:
> >>  - change the description for xfs_repair
> >>  - remove the description for "xfs_repair -L"
> >>
> >> Thanks,
> >> Zorro
> >>
> >>  man/man8/xfs_repair.8 | 7 ++++++-
> >>  1 file changed, 6 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/man/man8/xfs_repair.8 b/man/man8/xfs_repair.8
> >> index 1b4d9e3..e45fd90 100644
> >> --- a/man/man8/xfs_repair.8
> >> +++ b/man/man8/xfs_repair.8
> >> @@ -504,12 +504,17 @@ that is known to be free. The entry is therefore invalid and is deleted.
> >>  This message refers to a large directory.
> >>  If the directory were small, the message would read "junking entry ...".
> >>  .SH EXIT STATUS
> >> +.TP
> >>  .B xfs_repair \-n
> >>  (no modify node)
> >>  will return a status of 1 if filesystem corruption was detected and
> >>  0 if no filesystem corruption was detected.
> >> +.TP
> >>  .B xfs_repair
> >> -run without the \-n option will always return a status code of 0.
> >> +run without the \-n option will return a status code of 2 if it finds a
> >> +filesystem log which needs to be replayed(by a mount/umount cycle) or
> >> +cleared(by -L option), 1 if a runtime error is encountered, and 0 in all
> >> +other cases, whether or not filesystem corruption was detected.
> > 
> > So... I'd rather the documentation about the return code reflect the
> > status of the filesystem -- 2 means "unclean log, replay it or zap it",
> > 1 means "errors encountered, fs may not be correct", and 0 /should/ mean
> > "fs is correct".
> > 
> > OTOH I don't know for sure that xfs_repair always cleans up the fs on
> > the first try.
> 
> That's certainly the intent; I can't imagine a manpage documenting
> return codes qualified with "... unless bugs happen." :)

Right - if we hit bugs, all bets are off. But otherwise, the fs
should be repaired and clean after a single pass.

> >  ISTR
> > asking Dave about this, and I think he said that the FS should be clean
> > if repair returns 0.  But I'll let him reiterate that if it's true;
> > don't trust my crummy memory, that's why I have filesystems. ;)
> 
> Did you have an alternate wording in mind?

Yup, 0 = " fs is clean", 1 = "fs is still b0rken",
2 = "couldn't run for whatever reason given"

Cheers,

Dave.

Eric Sandeen Sept. 13, 2016, 9:52 p.m. UTC | #5
On 9/13/16 4:48 PM, Dave Chinner wrote:
> On Tue, Sep 13, 2016 at 11:57:59AM -0500, Eric Sandeen wrote:
>> On 9/13/16 11:32 AM, Darrick J. Wong wrote:

...

>>> So... I'd rather the documentation about the return code reflect the
>>> status of the filesystem -- 2 means "unclean log, replay it or zap it",
>>> 1 means "errors encountered, fs may not be correct", and 0 /should/ mean
>>> "fs is correct".
>>>
>>> OTOH I don't know for sure that xfs_repair always cleans up the fs on
>>> the first try.
>>
>> That's certainly the intent; I can't imagine a manpage documenting
>> return codes qualified with "... unless bugs happen." :)
> 
> Right - if we hit bugs, all bets are off. But otherwise, the fs
> should be repaired and clean after a single pass.
> 
>>>  ISTR
>>> asking Dave about this, and I think he said that the FS should be clean
>>> if repair returns 0.  But I'll let him reiterate that if it's true;
>>> don't trust my crummy memory, that's why I have filesystems. ;)
>>
>> Did you have an alternate wording in mind?
> 
> Yup, 0 = " fs is clean", 1 = "fs is still b0rken",
> 2 = "couldn't run for whatever reason given"

Technically, 1 = "may or may not be broken" - we really don't know.
We could get an exit of 1 for a consistent filesystem, for example
if some allocation failed... all we know is something bonked out in
the middle.

Maybe "1 == xfs_repair did not run to completion?"

-Eric

Dave Chinner Sept. 14, 2016, 1:34 a.m. UTC | #6
On Tue, Sep 13, 2016 at 04:52:32PM -0500, Eric Sandeen wrote:
> 
> 
> On 9/13/16 4:48 PM, Dave Chinner wrote:
> > On Tue, Sep 13, 2016 at 11:57:59AM -0500, Eric Sandeen wrote:
> >> On 9/13/16 11:32 AM, Darrick J. Wong wrote:
> 
> ...
> 
> >>> So... I'd rather the documentation about the return code reflect the
> >>> status of the filesystem -- 2 means "unclean log, replay it or zap it",
> >>> 1 means "errors encountered, fs may not be correct", and 0 /should/ mean
> >>> "fs is correct".
> >>>
> >>> OTOH I don't know for sure that xfs_repair always cleans up the fs on
> >>> the first try.
> >>
> >> That's certainly the intent; I can't imagine a manpage documenting
> >> return codes qualified with "... unless bugs happen." :)
> > 
> > Right - if we hit bugs, all bets are off. But otherwise, the fs
> > should be repaired and clean after a single pass.
> > 
> >>>  ISTR
> >>> asking Dave about this, and I think he said that the FS should be clean
> >>> if repair returns 0.  But I'll let him reiterate that if it's true;
> >>> don't trust my crummy memory, that's why I have filesystems. ;)
> >>
> >> Did you have an alternate wording in mind?
> > 
> > Yup, 0 = " fs is clean", 1 = "fs is still b0rken",
> > 2 = "couldn't run for whatever reason given"
> 
> Technically, 1 = "may or may not be broken" - we really don't know.
> We could get an exit of 1 for a consistent filesystem, for example
> if some allocation failed... all we know is something bonked out in
> the middle.
> 
> Maybe "1 == xfs_repair did not run to completion?"

Well, if it fails part way through phase 5, then the filesystem is
most definitely broken, even if it was clean to begin with. i.e.
repair, even when the filesystem is clean, will rebuild parts of the
filesystem from scratch.

And repair nulls out directory entries in phase 4 and doesn't
rebuild those directories till phase 6, so between those points the
filesystem is actually in a corrupt state that requires repair.
hence there is a large scope where a failure in repair really does
mean that we need to run repair again. Hence I think it's simply
safer to explicitly document it as:

	"1 == fs may be even more broken than before repair started,
	so repair needs to be run again"

because "did not run to completion" does not really tell the user
what to do when it occurs.

Cheers,

Dave.
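
For a caller scripting around xfs_repair, the semantics proposed in this thread (0 = fs is clean, 1 = repair must be run again, 2 = repair could not run, e.g. because of a dirty log) might translate into a retry loop along these lines. This is only a sketch of the discussion, not committed behaviour: the function name, the `run_repair` callable, the returned labels, and the attempt limit are all hypothetical.

```python
from typing import Callable


def repair_until_clean(run_repair: Callable[[], int], max_attempts: int = 3) -> str:
    """Re-run a repair command while it reports status 1.

    run_repair is a hypothetical callable that runs xfs_repair once
    (without -n) and returns its exit status.
    """
    for _ in range(max_attempts):
        status = run_repair()
        if status == 0:
            # Per the thread: the filesystem should be clean after one pass.
            return "clean"
        if status == 2:
            # Repair refused to run, e.g. the log needs replay (mount/umount)
            # or clearing (-L). Re-running will not help; the admin must act.
            return "could-not-run"
        if status != 1:
            # Not one of the statuses discussed in the thread.
            return "unexpected"
        # status == 1: the fs may be more broken than before; run repair again.
    return "still-broken"
```

In real use `run_repair` could wrap something like `subprocess.run(["xfs_repair", device]).returncode`; passing it in as a callable keeps the loop testable without touching a block device.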

Patch

diff --git a/man/man8/xfs_repair.8 b/man/man8/xfs_repair.8
index 1b4d9e3..e45fd90 100644
--- a/man/man8/xfs_repair.8
+++ b/man/man8/xfs_repair.8
@@ -504,12 +504,17 @@  that is known to be free. The entry is therefore invalid and is deleted.
 This message refers to a large directory.
 If the directory were small, the message would read "junking entry ...".
 .SH EXIT STATUS
+.TP
 .B xfs_repair \-n
 (no modify node)
 will return a status of 1 if filesystem corruption was detected and
 0 if no filesystem corruption was detected.
+.TP
 .B xfs_repair
-run without the \-n option will always return a status code of 0.
+run without the \-n option will return a status code of 2 if it finds a
+filesystem log which needs to be replayed(by a mount/umount cycle) or
+cleared(by -L option), 1 if a runtime error is encountered, and 0 in all
+other cases, whether or not filesystem corruption was detected.
 .SH BUGS
 The filesystem to be checked and repaired must have been
 unmounted cleanly using normal system administration procedures
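
As an illustration of the exit statuses as worded in this v2 patch (which was later superseded, so the final man page text may differ), a small lookup could look like the sketch below. The function names and message strings are hypothetical; only the numeric statuses and the `-n` flag come from the patch.

```python
import subprocess


def describe_exit_status(status: int, no_modify: bool) -> str:
    """Map an xfs_repair exit status to the meaning given in the v2 patch."""
    if no_modify:
        # xfs_repair -n (no modify mode): 1 = corruption detected, 0 = none.
        meanings = {
            0: "no filesystem corruption detected",
            1: "filesystem corruption detected",
        }
    else:
        # xfs_repair without -n, as reworded by this patch.
        meanings = {
            0: "completed, whether or not corruption was detected",
            1: "runtime error encountered",
            2: "log needs to be replayed (mount/umount) or cleared (-L)",
        }
    return meanings.get(status, "status not documented in this patch")


def run_and_describe(device: str, no_modify: bool = True) -> str:
    """Run xfs_repair on device (hypothetical helper) and describe the result."""
    cmd = ["xfs_repair"] + (["-n"] if no_modify else []) + [device]
    return describe_exit_status(subprocess.run(cmd).returncode, no_modify)
```

The helper defaults to `-n` so that a careless call only checks the filesystem and does not modify it.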