
[3/3] generic/757: fix various bugs in this test

Message ID 173146178859.156441.16666438727834100554.stgit@frogsfrogsfrogs (mailing list archive)
State New, archived
Series [1/3] xfs/273: check thoroughness of the mappings

Commit Message

Darrick J. Wong Nov. 13, 2024, 1:37 a.m. UTC
From: Darrick J. Wong <djwong@kernel.org>

Fix this test so the check doesn't fail on XFS, and restrict runtime to
100 loops because otherwise this test takes many hours.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 tests/generic/757 |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
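
A minimal sketch of the loop-capping idea from the commit message, under stated assumptions: `find_next_entry` is a hypothetical stand-in for `_log_writes_find_next_fua`, the log length of 5 is invented, and the "stop when no entry is left" condition is an assumption because the rest of the loop body is not visible in the hunk.

```shell
# Hypothetical stand-in for _log_writes_find_next_fua: prints the next
# entry number after $1, or nothing once the (made-up) log is exhausted.
find_next_entry() {
	local prev="$1"
	local last_entry=5	# assumed log length for this sketch

	[ "$prev" -lt "$last_entry" ] && echo $((prev + 1))
	return 0
}

# Visit entries until the log runs out or max_loops iterations have run.
# Prints the number of iterations performed.
bounded_replay() {
	local max_loops="$1"
	local cur
	local i=0

	cur=$(find_next_entry 0)
	while [ -n "$cur" ] && [ "$i" -lt "$max_loops" ]; do
		# a real test would replay the log range and check the fs here
		i=$((i + 1))
		cur=$(find_next_entry "$cur")
	done
	echo "$i"
}
```

With a cap of 100 the sketch stops when the five-entry log is exhausted; with a cap of 3 it stops early, which is the runtime bound the commit message is after.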

Comments

Christoph Hellwig Nov. 13, 2024, 8:48 a.m. UTC | #1
On Tue, Nov 12, 2024 at 05:37:29PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Fix this test so the check doesn't fail on XFS, and restrict runtime to
> 100 loops because otherwise this test takes many hours.

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

Zorro Lang Nov. 14, 2024, 5:23 a.m. UTC | #2
On Tue, Nov 12, 2024 at 05:37:29PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Fix this test so the check doesn't fail on XFS, and restrict runtime to
> 100 loops because otherwise this test takes many hours.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  tests/generic/757 |    7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> 
> diff --git a/tests/generic/757 b/tests/generic/757
> index 0ff5a8ac00182b..9d41975bde07bb 100755
> --- a/tests/generic/757
> +++ b/tests/generic/757
> @@ -63,9 +63,14 @@ prev=$(_log_writes_mark_to_entry_number mkfs)
>  cur=$(_log_writes_find_next_fua $prev)
>  [ -z "$cur" ] && _fail "failed to locate next FUA write"
>  
> -while [ ! -z "$cur" ]; do
> +for ((i = 0; i < 100; i++)); do
>  	_log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full
>  
> +	# xfs_repair won't run if the log is dirty
> +	if [ $FSTYP = "xfs" ]; then
> +		_scratch_mount

Hi Darrick, can you mount here? I always get a mount error, as shown below:

SECTION       -- default
FSTYP         -- xfs (non-debug)
PLATFORM      -- Linux/x86_64 dell-per750-41 6.12.0-0.rc5.44.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Oct 28 14:12:55 UTC 2024
MKFS_OPTIONS  -- -f /dev/sda6
MOUNT_OPTIONS -- -o context=system_u:object_r:root_t:s0 /dev/sda6 /mnt/scratch

generic/757 2185s ... [failed, exit status 1]- output mismatch (see /root/git/xfstests/results//default/generic/757.out.bad)
    --- tests/generic/757.out   2024-10-27 03:09:48.740518275 +0800
    +++ /root/git/xfstests/results//default/generic/757.out.bad 2024-11-14 13:18:56.965210155 +0800
    @@ -1,2 +1,5 @@
     QA output created by 757
    -Silence is golden
    +mount: /mnt/scratch: cannot mount; probably corrupted filesystem on /dev/sda6.
    +       dmesg(1) may have more information after failed mount system call.
    +mount -o context=system_u:object_r:root_t:s0 /dev/sda6 /mnt/scratch failed
    +(see /root/git/xfstests/results//default/generic/757.full for details)
    ...
    (Run 'diff -u /root/git/xfstests/tests/generic/757.out /root/git/xfstests/results//default/generic/757.out.bad'  to see the entire diff)
Ran: generic/757
Failures: generic/757
Failed 1 of 1 tests

# dmesg
...
[1258572.169378] XFS (sda6): Mounting V5 Filesystem a0bf3918-1b66-4973-b03c-afd5197a6d21
[1258572.193037] XFS (sda6): Starting recovery (logdev: internal)
[1258572.201691] XFS (sda6): Corruption warning: Metadata has LSN (1:41116) ahead of current LSN (1:161). Please unmount and run xfs_repair (>= v4.3) to resolve.
[1258572.215850] XFS (sda6): Metadata CRC error detected at xfs_bmbt_read_verify+0x16/0xc0 [xfs], xfs_bmbt block 0x2000e8 
[1258572.226825] XFS (sda6): Unmount and run xfs_repair
[1258572.231796] XFS (sda6): First 128 bytes of corrupted metadata buffer:
[1258572.238411] 00000000: 42 4d 41 33 00 00 00 fb 00 00 00 00 00 04 00 9e  BMA3............
[1258572.246585] 00000010: 00 00 00 00 00 04 00 60 00 00 00 00 00 20 00 e8  .......`..... ..
[1258572.254766] 00000020: 00 00 00 01 00 00 a0 9c a0 bf 39 18 1b 66 49 73  ..........9..fIs
[1258572.262945] 00000030: b0 3c af d5 19 7a 6d 21 00 00 00 00 00 00 00 83  .<...zm!........
[1258572.271117] 00000040: 17 2f 1b e4 00 00 00 00 00 00 00 00 04 b1 2e 00  ./..............
[1258572.279291] 00000050: 00 00 00 4b 15 e0 00 01 80 00 00 00 04 b1 30 00  ...K..........0.
[1258572.287462] 00000060: 00 00 00 4b 16 00 00 4f 00 00 00 00 04 b1 ce 00  ...K...O........
[1258572.295635] 00000070: 00 00 00 4b 1f e0 00 01 80 00 00 00 04 b1 d0 00  ...K............
[1258572.303811] XFS (sda6): Filesystem has been shut down due to log error (0x2).
[1258572.311123] XFS (sda6): Please unmount the filesystem and rectify the problem(s).
[1258572.318791] XFS (sda6): log mount/recovery failed: error -74
[1258572.324798] XFS (sda6): log mount failed
[1258572.365169] XFS (sda5): Unmounting Filesystem eb4b7840-2c01-4306-9a6c-af2e7207a23f

> +		_scratch_unmount
> +	fi


>  	_check_scratch_fs
>  
>  	prev=$cur
>

Darrick J. Wong Nov. 14, 2024, 5:30 a.m. UTC | #3
On Thu, Nov 14, 2024 at 01:23:28PM +0800, Zorro Lang wrote:
> On Tue, Nov 12, 2024 at 05:37:29PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > Fix this test so the check doesn't fail on XFS, and restrict runtime to
> > 100 loops because otherwise this test takes many hours.
> > 
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > ---
> >  tests/generic/757 |    7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> > 
> > 
> > diff --git a/tests/generic/757 b/tests/generic/757
> > index 0ff5a8ac00182b..9d41975bde07bb 100755
> > --- a/tests/generic/757
> > +++ b/tests/generic/757
> > @@ -63,9 +63,14 @@ prev=$(_log_writes_mark_to_entry_number mkfs)
> >  cur=$(_log_writes_find_next_fua $prev)
> >  [ -z "$cur" ] && _fail "failed to locate next FUA write"
> >  
> > -while [ ! -z "$cur" ]; do
> > +for ((i = 0; i < 100; i++)); do
> >  	_log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full
> >  
> > +	# xfs_repair won't run if the log is dirty
> > +	if [ $FSTYP = "xfs" ]; then
> > +		_scratch_mount
> 
> Hi Darrick, can you mount here? I always get a mount error, as shown below:
> 
> SECTION       -- default
> FSTYP         -- xfs (non-debug)
> PLATFORM      -- Linux/x86_64 dell-per750-41 6.12.0-0.rc5.44.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Oct 28 14:12:55 UTC 2024
> MKFS_OPTIONS  -- -f /dev/sda6
> MOUNT_OPTIONS -- -o context=system_u:object_r:root_t:s0 /dev/sda6 /mnt/scratch
> 
> generic/757 2185s ... [failed, exit status 1]- output mismatch (see /root/git/xfstests/results//default/generic/757.out.bad)
>     --- tests/generic/757.out   2024-10-27 03:09:48.740518275 +0800
>     +++ /root/git/xfstests/results//default/generic/757.out.bad 2024-11-14 13:18:56.965210155 +0800
>     @@ -1,2 +1,5 @@
>      QA output created by 757
>     -Silence is golden
>     +mount: /mnt/scratch: cannot mount; probably corrupted filesystem on /dev/sda6.
>     +       dmesg(1) may have more information after failed mount system call.
>     +mount -o context=system_u:object_r:root_t:s0 /dev/sda6 /mnt/scratch failed
>     +(see /root/git/xfstests/results//default/generic/757.full for details)
>     ...
>     (Run 'diff -u /root/git/xfstests/tests/generic/757.out /root/git/xfstests/results//default/generic/757.out.bad'  to see the entire diff)
> Ran: generic/757
> Failures: generic/757
> Failed 1 of 1 tests
> 
> # dmesg
> ...
> [1258572.169378] XFS (sda6): Mounting V5 Filesystem a0bf3918-1b66-4973-b03c-afd5197a6d21
> [1258572.193037] XFS (sda6): Starting recovery (logdev: internal)
> [1258572.201691] XFS (sda6): Corruption warning: Metadata has LSN (1:41116) ahead of current LSN (1:161). Please unmount and run xfs_repair (>= v4.3) to resolve.
> [1258572.215850] XFS (sda6): Metadata CRC error detected at xfs_bmbt_read_verify+0x16/0xc0 [xfs], xfs_bmbt block 0x2000e8 
> [1258572.226825] XFS (sda6): Unmount and run xfs_repair
> [1258572.231796] XFS (sda6): First 128 bytes of corrupted metadata buffer:
> [1258572.238411] 00000000: 42 4d 41 33 00 00 00 fb 00 00 00 00 00 04 00 9e  BMA3............
> [1258572.246585] 00000010: 00 00 00 00 00 04 00 60 00 00 00 00 00 20 00 e8  .......`..... ..
> [1258572.254766] 00000020: 00 00 00 01 00 00 a0 9c a0 bf 39 18 1b 66 49 73  ..........9..fIs
> [1258572.262945] 00000030: b0 3c af d5 19 7a 6d 21 00 00 00 00 00 00 00 83  .<...zm!........
> [1258572.271117] 00000040: 17 2f 1b e4 00 00 00 00 00 00 00 00 04 b1 2e 00  ./..............
> [1258572.279291] 00000050: 00 00 00 4b 15 e0 00 01 80 00 00 00 04 b1 30 00  ...K..........0.
> [1258572.287462] 00000060: 00 00 00 4b 16 00 00 4f 00 00 00 00 04 b1 ce 00  ...K...O........
> [1258572.295635] 00000070: 00 00 00 4b 1f e0 00 01 80 00 00 00 04 b1 d0 00  ...K............
> [1258572.303811] XFS (sda6): Filesystem has been shut down due to log error (0x2).
> [1258572.311123] XFS (sda6): Please unmount the filesystem and rectify the problem(s).
> [1258572.318791] XFS (sda6): log mount/recovery failed: error -74
> [1258572.324798] XFS (sda6): log mount failed
> [1258572.365169] XFS (sda5): Unmounting Filesystem eb4b7840-2c01-4306-9a6c-af2e7207a23f

I see periodic corruption messages, but generally the mount succeeds and
the test passes, even with TOT -rc6.

--D

> > +		_scratch_unmount
> > +	fi
> 
> 
> >  	_check_scratch_fs
> >  
> >  	prev=$cur
> > 
> 
>

Zorro Lang Nov. 15, 2024, 5:42 a.m. UTC | #4
On Wed, Nov 13, 2024 at 09:30:19PM -0800, Darrick J. Wong wrote:
> On Thu, Nov 14, 2024 at 01:23:28PM +0800, Zorro Lang wrote:
> > On Tue, Nov 12, 2024 at 05:37:29PM -0800, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <djwong@kernel.org>
> > > 
> > > Fix this test so the check doesn't fail on XFS, and restrict runtime to
> > > 100 loops because otherwise this test takes many hours.
> > > 
> > > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > > ---
> > >  tests/generic/757 |    7 ++++++-
> > >  1 file changed, 6 insertions(+), 1 deletion(-)
> > > 
> > > 
> > > diff --git a/tests/generic/757 b/tests/generic/757
> > > index 0ff5a8ac00182b..9d41975bde07bb 100755
> > > --- a/tests/generic/757
> > > +++ b/tests/generic/757
> > > @@ -63,9 +63,14 @@ prev=$(_log_writes_mark_to_entry_number mkfs)
> > >  cur=$(_log_writes_find_next_fua $prev)
> > >  [ -z "$cur" ] && _fail "failed to locate next FUA write"
> > >  
> > > -while [ ! -z "$cur" ]; do
> > > +for ((i = 0; i < 100; i++)); do
> > >  	_log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full
> > >  
> > > +	# xfs_repair won't run if the log is dirty
> > > +	if [ $FSTYP = "xfs" ]; then
> > > +		_scratch_mount
> > 
> > Hi Darrick, can you mount here? I always get a mount error, as shown below:
> > 
> > SECTION       -- default
> > FSTYP         -- xfs (non-debug)
> > PLATFORM      -- Linux/x86_64 dell-per750-41 6.12.0-0.rc5.44.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Oct 28 14:12:55 UTC 2024
> > MKFS_OPTIONS  -- -f /dev/sda6
> > MOUNT_OPTIONS -- -o context=system_u:object_r:root_t:s0 /dev/sda6 /mnt/scratch
> > 
> > generic/757 2185s ... [failed, exit status 1]- output mismatch (see /root/git/xfstests/results//default/generic/757.out.bad)
> >     --- tests/generic/757.out   2024-10-27 03:09:48.740518275 +0800
> >     +++ /root/git/xfstests/results//default/generic/757.out.bad 2024-11-14 13:18:56.965210155 +0800
> >     @@ -1,2 +1,5 @@
> >      QA output created by 757
> >     -Silence is golden
> >     +mount: /mnt/scratch: cannot mount; probably corrupted filesystem on /dev/sda6.
> >     +       dmesg(1) may have more information after failed mount system call.
> >     +mount -o context=system_u:object_r:root_t:s0 /dev/sda6 /mnt/scratch failed
> >     +(see /root/git/xfstests/results//default/generic/757.full for details)
> >     ...
> >     (Run 'diff -u /root/git/xfstests/tests/generic/757.out /root/git/xfstests/results//default/generic/757.out.bad'  to see the entire diff)
> > Ran: generic/757
> > Failures: generic/757
> > Failed 1 of 1 tests
> > 
> > # dmesg
> > ...
> > [1258572.169378] XFS (sda6): Mounting V5 Filesystem a0bf3918-1b66-4973-b03c-afd5197a6d21
> > [1258572.193037] XFS (sda6): Starting recovery (logdev: internal)
> > [1258572.201691] XFS (sda6): Corruption warning: Metadata has LSN (1:41116) ahead of current LSN (1:161). Please unmount and run xfs_repair (>= v4.3) to resolve.
> > [1258572.215850] XFS (sda6): Metadata CRC error detected at xfs_bmbt_read_verify+0x16/0xc0 [xfs], xfs_bmbt block 0x2000e8 
> > [1258572.226825] XFS (sda6): Unmount and run xfs_repair
> > [1258572.231796] XFS (sda6): First 128 bytes of corrupted metadata buffer:
> > [1258572.238411] 00000000: 42 4d 41 33 00 00 00 fb 00 00 00 00 00 04 00 9e  BMA3............
> > [1258572.246585] 00000010: 00 00 00 00 00 04 00 60 00 00 00 00 00 20 00 e8  .......`..... ..
> > [1258572.254766] 00000020: 00 00 00 01 00 00 a0 9c a0 bf 39 18 1b 66 49 73  ..........9..fIs
> > [1258572.262945] 00000030: b0 3c af d5 19 7a 6d 21 00 00 00 00 00 00 00 83  .<...zm!........
> > [1258572.271117] 00000040: 17 2f 1b e4 00 00 00 00 00 00 00 00 04 b1 2e 00  ./..............
> > [1258572.279291] 00000050: 00 00 00 4b 15 e0 00 01 80 00 00 00 04 b1 30 00  ...K..........0.
> > [1258572.287462] 00000060: 00 00 00 4b 16 00 00 4f 00 00 00 00 04 b1 ce 00  ...K...O........
> > [1258572.295635] 00000070: 00 00 00 4b 1f e0 00 01 80 00 00 00 04 b1 d0 00  ...K............
> > [1258572.303811] XFS (sda6): Filesystem has been shut down due to log error (0x2).
> > [1258572.311123] XFS (sda6): Please unmount the filesystem and rectify the problem(s).
> > [1258572.318791] XFS (sda6): log mount/recovery failed: error -74
> > [1258572.324798] XFS (sda6): log mount failed
> > [1258572.365169] XFS (sda5): Unmounting Filesystem eb4b7840-2c01-4306-9a6c-af2e7207a23f
> 
> I see periodic corruption messages, but generally the mount succeeds and
> the test passes, even with TOT -rc6.

Still fails on -rc7+ [1]. Even with `xfs_repair $SCRATCH_DEV` before the mount, it still fails [2].
But `xfs_repair -L` helps; the test can keep running after that.

Do you think it's an XFS issue, or a test case issue (does XFS need a log cleanup here)?

Thanks,
Zorro

[1]
# ./check -s default generic/757
SECTION       -- default
FSTYP         -- xfs (non-debug)
PLATFORM      -- Linux/x86_64 dell-per750-41 6.12.0-0.rc7.58.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Nov 11 15:23:45 UTC 2024
MKFS_OPTIONS  -- -f /dev/sda6
MOUNT_OPTIONS -- -o context=system_u:object_r:root_t:s0 /dev/sda6 /mnt/scratch

generic/757 2185s ... [failed, exit status 1]- output mismatch (see /root/git/xfstests/results//default/generic/757.out.bad)
    --- tests/generic/757.out   2024-10-27 03:09:48.740518275 +0800
    +++ /root/git/xfstests/results//default/generic/757.out.bad 2024-11-15 03:06:59.462739215 +0800
    @@ -1,2 +1,5 @@
     QA output created by 757
    -Silence is golden
    +mount: /mnt/scratch: cannot mount; probably corrupted filesystem on /dev/sda6.
    +       dmesg(1) may have more information after failed mount system call.
    +mount -o context=system_u:object_r:root_t:s0 /dev/sda6 /mnt/scratch failed
    +(see /root/git/xfstests/results//default/generic/757.full for details)
    ...
    (Run 'diff -u /root/git/xfstests/tests/generic/757.out /root/git/xfstests/results//default/generic/757.out.bad'  to see the entire diff)
Ran: generic/757
Failures: generic/757
Failed 1 of 1 tests

[2]
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair.  If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.
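
A script could tell the refusal in [2] apart from other xfs_repair failures by matching the message text. This is only a sketch: `repair_refused_dirty_log` is a hypothetical helper, and it assumes the wording stays the same as in the output quoted above.

```shell
# Hypothetical helper: read xfs_repair output on stdin and succeed only if
# it contains the dirty-log refusal quoted in [2].
repair_refused_dirty_log() {
	grep -q "valuable metadata changes in a log"
}
```

For example, `xfs_repair "$SCRATCH_DEV" 2>&1 | repair_refused_dirty_log` would succeed only when the log still needs to be replayed.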

> 
> --D
> 
> > > +		_scratch_unmount
> > > +	fi
> > 
> > 
> > >  	_check_scratch_fs
> > >  
> > >  	prev=$cur
> > > 
> > 
> > 
>

Darrick J. Wong Nov. 15, 2024, 5:30 p.m. UTC | #5
On Fri, Nov 15, 2024 at 01:42:51PM +0800, Zorro Lang wrote:
> On Wed, Nov 13, 2024 at 09:30:19PM -0800, Darrick J. Wong wrote:
> > On Thu, Nov 14, 2024 at 01:23:28PM +0800, Zorro Lang wrote:
> > > On Tue, Nov 12, 2024 at 05:37:29PM -0800, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <djwong@kernel.org>
> > > > 
> > > > Fix this test so the check doesn't fail on XFS, and restrict runtime to
> > > > 100 loops because otherwise this test takes many hours.
> > > > 
> > > > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > > > ---
> > > >  tests/generic/757 |    7 ++++++-
> > > >  1 file changed, 6 insertions(+), 1 deletion(-)
> > > > 
> > > > 
> > > > diff --git a/tests/generic/757 b/tests/generic/757
> > > > index 0ff5a8ac00182b..9d41975bde07bb 100755
> > > > --- a/tests/generic/757
> > > > +++ b/tests/generic/757
> > > > @@ -63,9 +63,14 @@ prev=$(_log_writes_mark_to_entry_number mkfs)
> > > >  cur=$(_log_writes_find_next_fua $prev)
> > > >  [ -z "$cur" ] && _fail "failed to locate next FUA write"
> > > >  
> > > > -while [ ! -z "$cur" ]; do
> > > > +for ((i = 0; i < 100; i++)); do
> > > >  	_log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full
> > > >  
> > > > +	# xfs_repair won't run if the log is dirty
> > > > +	if [ $FSTYP = "xfs" ]; then
> > > > +		_scratch_mount
> > > 
> > > Hi Darrick, can you mount here? I always get a mount error, as shown below:
> > > 
> > > SECTION       -- default
> > > FSTYP         -- xfs (non-debug)
> > > PLATFORM      -- Linux/x86_64 dell-per750-41 6.12.0-0.rc5.44.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Oct 28 14:12:55 UTC 2024
> > > MKFS_OPTIONS  -- -f /dev/sda6
> > > MOUNT_OPTIONS -- -o context=system_u:object_r:root_t:s0 /dev/sda6 /mnt/scratch
> > > 
> > > generic/757 2185s ... [failed, exit status 1]- output mismatch (see /root/git/xfstests/results//default/generic/757.out.bad)
> > >     --- tests/generic/757.out   2024-10-27 03:09:48.740518275 +0800
> > >     +++ /root/git/xfstests/results//default/generic/757.out.bad 2024-11-14 13:18:56.965210155 +0800
> > >     @@ -1,2 +1,5 @@
> > >      QA output created by 757
> > >     -Silence is golden
> > >     +mount: /mnt/scratch: cannot mount; probably corrupted filesystem on /dev/sda6.
> > >     +       dmesg(1) may have more information after failed mount system call.
> > >     +mount -o context=system_u:object_r:root_t:s0 /dev/sda6 /mnt/scratch failed
> > >     +(see /root/git/xfstests/results//default/generic/757.full for details)
> > >     ...
> > >     (Run 'diff -u /root/git/xfstests/tests/generic/757.out /root/git/xfstests/results//default/generic/757.out.bad'  to see the entire diff)
> > > Ran: generic/757
> > > Failures: generic/757
> > > Failed 1 of 1 tests
> > > 
> > > # dmesg
> > > ...
> > > [1258572.169378] XFS (sda6): Mounting V5 Filesystem a0bf3918-1b66-4973-b03c-afd5197a6d21
> > > [1258572.193037] XFS (sda6): Starting recovery (logdev: internal)
> > > [1258572.201691] XFS (sda6): Corruption warning: Metadata has LSN (1:41116) ahead of current LSN (1:161). Please unmount and run xfs_repair (>= v4.3) to resolve.
> > > [1258572.215850] XFS (sda6): Metadata CRC error detected at xfs_bmbt_read_verify+0x16/0xc0 [xfs], xfs_bmbt block 0x2000e8 
> > > [1258572.226825] XFS (sda6): Unmount and run xfs_repair
> > > [1258572.231796] XFS (sda6): First 128 bytes of corrupted metadata buffer:
> > > [1258572.238411] 00000000: 42 4d 41 33 00 00 00 fb 00 00 00 00 00 04 00 9e  BMA3............
> > > [1258572.246585] 00000010: 00 00 00 00 00 04 00 60 00 00 00 00 00 20 00 e8  .......`..... ..
> > > [1258572.254766] 00000020: 00 00 00 01 00 00 a0 9c a0 bf 39 18 1b 66 49 73  ..........9..fIs
> > > [1258572.262945] 00000030: b0 3c af d5 19 7a 6d 21 00 00 00 00 00 00 00 83  .<...zm!........
> > > [1258572.271117] 00000040: 17 2f 1b e4 00 00 00 00 00 00 00 00 04 b1 2e 00  ./..............
> > > [1258572.279291] 00000050: 00 00 00 4b 15 e0 00 01 80 00 00 00 04 b1 30 00  ...K..........0.
> > > [1258572.287462] 00000060: 00 00 00 4b 16 00 00 4f 00 00 00 00 04 b1 ce 00  ...K...O........
> > > [1258572.295635] 00000070: 00 00 00 4b 1f e0 00 01 80 00 00 00 04 b1 d0 00  ...K............
> > > [1258572.303811] XFS (sda6): Filesystem has been shut down due to log error (0x2).
> > > [1258572.311123] XFS (sda6): Please unmount the filesystem and rectify the problem(s).
> > > [1258572.318791] XFS (sda6): log mount/recovery failed: error -74
> > > [1258572.324798] XFS (sda6): log mount failed
> > > [1258572.365169] XFS (sda5): Unmounting Filesystem eb4b7840-2c01-4306-9a6c-af2e7207a23f
> > 
> > I see periodic corruption messages, but generally the mount succeeds and
> > the test passes, even with TOT -rc6.
> 
> Still fails on -rc7+ [1]. Even with `xfs_repair $SCRATCH_DEV` before the mount, it still fails [2].
> But `xfs_repair -L` helps; the test can keep running after that.
> 
> Do you think it's an XFS issue, or a test case issue (does XFS need a log cleanup here)?

I'm not sure.  Does your sda6 device support discards?  My VMs'
SCRATCH_DEVs usually support it, and I noticed that all the other
generic/ _log_writes_init tests set up a dm-thin volume so that the
replays can always zero out the whole device before jumping to a
snapshot.
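
One way to answer the discard question is to read `discard_max_bytes` from the device's sysfs queue directory; a value of 0 means no discard support. The helper below is a sketch, not part of fstests: `dev_supports_discard` is a hypothetical name, and for a partition such as sda6 the queue directory is assumed to be the parent disk's (/sys/block/sda/queue).

```shell
# Hypothetical helper, not part of fstests: succeed if the block device
# whose sysfs queue directory is $1 advertises discard support.
# discard_max_bytes is 0 when the device cannot discard.
dev_supports_discard() {
	local queue_dir="$1"
	local max_bytes

	[ -r "$queue_dir/discard_max_bytes" ] || return 1
	max_bytes=$(cat "$queue_dir/discard_max_bytes")
	[ "$max_bytes" -gt 0 ] 2>/dev/null
}
```

Usage would look like `dev_supports_discard /sys/block/sda/queue && echo "discard supported"`; `lsblk --discard /dev/sda6` reports the same limits in tabular form.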

--D

> Thanks,
> Zorro
> 
> [1]
> # ./check -s default generic/757
> SECTION       -- default
> FSTYP         -- xfs (non-debug)
> PLATFORM      -- Linux/x86_64 dell-per750-41 6.12.0-0.rc7.58.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Nov 11 15:23:45 UTC 2024
> MKFS_OPTIONS  -- -f /dev/sda6
> MOUNT_OPTIONS -- -o context=system_u:object_r:root_t:s0 /dev/sda6 /mnt/scratch
> 
> generic/757 2185s ... [failed, exit status 1]- output mismatch (see /root/git/xfstests/results//default/generic/757.out.bad)
>     --- tests/generic/757.out   2024-10-27 03:09:48.740518275 +0800
>     +++ /root/git/xfstests/results//default/generic/757.out.bad 2024-11-15 03:06:59.462739215 +0800
>     @@ -1,2 +1,5 @@
>      QA output created by 757
>     -Silence is golden
>     +mount: /mnt/scratch: cannot mount; probably corrupted filesystem on /dev/sda6.
>     +       dmesg(1) may have more information after failed mount system call.
>     +mount -o context=system_u:object_r:root_t:s0 /dev/sda6 /mnt/scratch failed
>     +(see /root/git/xfstests/results//default/generic/757.full for details)
>     ...
>     (Run 'diff -u /root/git/xfstests/tests/generic/757.out /root/git/xfstests/results//default/generic/757.out.bad'  to see the entire diff)
> Ran: generic/757
> Failures: generic/757
> Failed 1 of 1 tests
> 
> [2]
> Phase 1 - find and verify superblock...
> Phase 2 - using internal log
>         - zero log...
> ERROR: The filesystem has valuable metadata changes in a log which needs to
> be replayed.  Mount the filesystem to replay the log, and unmount it before
> re-running xfs_repair.  If you are unable to mount the filesystem, then use
> the -L option to destroy the log and attempt a repair.
> Note that destroying the log may cause corruption -- please attempt a mount
> of the filesystem before doing this.
> 
> > 
> > --D
> > 
> > > > +		_scratch_unmount
> > > > +	fi
> > > 
> > > 
> > > >  	_check_scratch_fs
> > > >  
> > > >  	prev=$cur
> > > > 
> > > 
> > > 
> > 
> 
>

Zorro Lang Nov. 15, 2024, 6:28 p.m. UTC | #6
On Fri, Nov 15, 2024 at 09:30:27AM -0800, Darrick J. Wong wrote:
> On Fri, Nov 15, 2024 at 01:42:51PM +0800, Zorro Lang wrote:
> > On Wed, Nov 13, 2024 at 09:30:19PM -0800, Darrick J. Wong wrote:
> > > On Thu, Nov 14, 2024 at 01:23:28PM +0800, Zorro Lang wrote:
> > > > On Tue, Nov 12, 2024 at 05:37:29PM -0800, Darrick J. Wong wrote:
> > > > > From: Darrick J. Wong <djwong@kernel.org>
> > > > > 
> > > > > Fix this test so the check doesn't fail on XFS, and restrict runtime to
> > > > > 100 loops because otherwise this test takes many hours.
> > > > > 
> > > > > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > > > > ---
> > > > >  tests/generic/757 |    7 ++++++-
> > > > >  1 file changed, 6 insertions(+), 1 deletion(-)
> > > > > 
> > > > > 
> > > > > diff --git a/tests/generic/757 b/tests/generic/757
> > > > > index 0ff5a8ac00182b..9d41975bde07bb 100755
> > > > > --- a/tests/generic/757
> > > > > +++ b/tests/generic/757
> > > > > @@ -63,9 +63,14 @@ prev=$(_log_writes_mark_to_entry_number mkfs)
> > > > >  cur=$(_log_writes_find_next_fua $prev)
> > > > >  [ -z "$cur" ] && _fail "failed to locate next FUA write"
> > > > >  
> > > > > -while [ ! -z "$cur" ]; do
> > > > > +for ((i = 0; i < 100; i++)); do
> > > > >  	_log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full
> > > > >  
> > > > > +	# xfs_repair won't run if the log is dirty
> > > > > +	if [ $FSTYP = "xfs" ]; then
> > > > > +		_scratch_mount
> > > > 
> > > > Hi Darrick, can you mount here? I always get a mount error, as shown below:
> > > > 
> > > > SECTION       -- default
> > > > FSTYP         -- xfs (non-debug)
> > > > PLATFORM      -- Linux/x86_64 dell-per750-41 6.12.0-0.rc5.44.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Oct 28 14:12:55 UTC 2024
> > > > MKFS_OPTIONS  -- -f /dev/sda6
> > > > MOUNT_OPTIONS -- -o context=system_u:object_r:root_t:s0 /dev/sda6 /mnt/scratch
> > > > 
> > > > generic/757 2185s ... [failed, exit status 1]- output mismatch (see /root/git/xfstests/results//default/generic/757.out.bad)
> > > >     --- tests/generic/757.out   2024-10-27 03:09:48.740518275 +0800
> > > >     +++ /root/git/xfstests/results//default/generic/757.out.bad 2024-11-14 13:18:56.965210155 +0800
> > > >     @@ -1,2 +1,5 @@
> > > >      QA output created by 757
> > > >     -Silence is golden
> > > >     +mount: /mnt/scratch: cannot mount; probably corrupted filesystem on /dev/sda6.
> > > >     +       dmesg(1) may have more information after failed mount system call.
> > > >     +mount -o context=system_u:object_r:root_t:s0 /dev/sda6 /mnt/scratch failed
> > > >     +(see /root/git/xfstests/results//default/generic/757.full for details)
> > > >     ...
> > > >     (Run 'diff -u /root/git/xfstests/tests/generic/757.out /root/git/xfstests/results//default/generic/757.out.bad'  to see the entire diff)
> > > > Ran: generic/757
> > > > Failures: generic/757
> > > > Failed 1 of 1 tests
> > > > 
> > > > # dmesg
> > > > ...
> > > > [1258572.169378] XFS (sda6): Mounting V5 Filesystem a0bf3918-1b66-4973-b03c-afd5197a6d21
> > > > [1258572.193037] XFS (sda6): Starting recovery (logdev: internal)
> > > > [1258572.201691] XFS (sda6): Corruption warning: Metadata has LSN (1:41116) ahead of current LSN (1:161). Please unmount and run xfs_repair (>= v4.3) to resolve.
> > > > [1258572.215850] XFS (sda6): Metadata CRC error detected at xfs_bmbt_read_verify+0x16/0xc0 [xfs], xfs_bmbt block 0x2000e8 
> > > > [1258572.226825] XFS (sda6): Unmount and run xfs_repair
> > > > [1258572.231796] XFS (sda6): First 128 bytes of corrupted metadata buffer:
> > > > [1258572.238411] 00000000: 42 4d 41 33 00 00 00 fb 00 00 00 00 00 04 00 9e  BMA3............
> > > > [1258572.246585] 00000010: 00 00 00 00 00 04 00 60 00 00 00 00 00 20 00 e8  .......`..... ..
> > > > [1258572.254766] 00000020: 00 00 00 01 00 00 a0 9c a0 bf 39 18 1b 66 49 73  ..........9..fIs
> > > > [1258572.262945] 00000030: b0 3c af d5 19 7a 6d 21 00 00 00 00 00 00 00 83  .<...zm!........
> > > > [1258572.271117] 00000040: 17 2f 1b e4 00 00 00 00 00 00 00 00 04 b1 2e 00  ./..............
> > > > [1258572.279291] 00000050: 00 00 00 4b 15 e0 00 01 80 00 00 00 04 b1 30 00  ...K..........0.
> > > > [1258572.287462] 00000060: 00 00 00 4b 16 00 00 4f 00 00 00 00 04 b1 ce 00  ...K...O........
> > > > [1258572.295635] 00000070: 00 00 00 4b 1f e0 00 01 80 00 00 00 04 b1 d0 00  ...K............
> > > > [1258572.303811] XFS (sda6): Filesystem has been shut down due to log error (0x2).
> > > > [1258572.311123] XFS (sda6): Please unmount the filesystem and rectify the problem(s).
> > > > [1258572.318791] XFS (sda6): log mount/recovery failed: error -74
> > > > [1258572.324798] XFS (sda6): log mount failed
> > > > [1258572.365169] XFS (sda5): Unmounting Filesystem eb4b7840-2c01-4306-9a6c-af2e7207a23f
> > > 
> > > I see periodic corruption messages, but generally the mount succeeds and
> > > the test passes, even with TOT -rc6.
> > 
> > Still fails on -rc7+ [1]. Even with `xfs_repair $SCRATCH_DEV` before the mount, it still fails [2].
> > But `xfs_repair -L` helps; the test can keep running after that.
> > 
> > Do you think it's an XFS issue, or a test case issue (does XFS need a log cleanup here)?
> 
> I'm not sure.  Does your sda6 device support discards?  My VMs'
> SCRATCH_DEVs usually support it, and I noticed that all the other
> generic/ _log_writes_init tests set up a dm-thin volume so that the
> replays can always zero out the whole device before jumping to a
> snapshot.

No, it doesn't support discard, but it's multi-striped:

# mkfs.xfs -f /dev/sda6
meta-data=/dev/sda6              isize=512    agcount=25, agsize=1064176 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=1
         =                       reflink=1    bigtime=1 inobtcount=1 nrext64=1
         =                       exchange=0  
data     =                       bsize=4096   blocks=26604400, imaxpct=25
         =                       sunit=16     swidth=32 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1, parent=0
log      =internal log           bsize=4096   blocks=179552, version=2
         =                       sectsz=512   sunit=16 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

But that reminds me: I remember Brian made the changes below for the dmlogwrites tests.

fc5870da4 generic/470: use thin volume for dmlogwrites target device
3713a3b37 generic/457: use thin volume for dmlogwrites target device
96bcbcabd generic/455: use thin volume for dmlogwrites target device

commit 96bcbcabd0f34dcd57f9349c8eea09523d69a817
Author: Brian Foster <bfoster@redhat.com>
Date:   Tue Sep 1 09:47:26 2020 -0400

    generic/455: use thin volume for dmlogwrites target device
    
    dmlogwrites support for XFS depends on discard zeroing support of
    the intended target device. Update the test to use a thin volume and
    allow it to run consistently and reliably on XFS.

Thanks,
Zorro

> 
> --D
> 
> > Thanks,
> > Zorro
> > 
> > [1]
> > # ./check -s default generic/757
> > SECTION       -- default
> > FSTYP         -- xfs (non-debug)
> > PLATFORM      -- Linux/x86_64 dell-per750-41 6.12.0-0.rc7.58.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Nov 11 15:23:45 UTC 2024
> > MKFS_OPTIONS  -- -f /dev/sda6
> > MOUNT_OPTIONS -- -o context=system_u:object_r:root_t:s0 /dev/sda6 /mnt/scratch
> > 
> > generic/757 2185s ... [failed, exit status 1]- output mismatch (see /root/git/xfstests/results//default/generic/757.out.bad)
> >     --- tests/generic/757.out   2024-10-27 03:09:48.740518275 +0800
> >     +++ /root/git/xfstests/results//default/generic/757.out.bad 2024-11-15 03:06:59.462739215 +0800
> >     @@ -1,2 +1,5 @@
> >      QA output created by 757
> >     -Silence is golden
> >     +mount: /mnt/scratch: cannot mount; probably corrupted filesystem on /dev/sda6.
> >     +       dmesg(1) may have more information after failed mount system call.
> >     +mount -o context=system_u:object_r:root_t:s0 /dev/sda6 /mnt/scratch failed
> >     +(see /root/git/xfstests/results//default/generic/757.full for details)
> >     ...
> >     (Run 'diff -u /root/git/xfstests/tests/generic/757.out /root/git/xfstests/results//default/generic/757.out.bad'  to see the entire diff)
> > Ran: generic/757
> > Failures: generic/757
> > Failed 1 of 1 tests
> > 
> > [2]
> > Phase 1 - find and verify superblock...
> > Phase 2 - using internal log
> >         - zero log...
> > ERROR: The filesystem has valuable metadata changes in a log which needs to
> > be replayed.  Mount the filesystem to replay the log, and unmount it before
> > re-running xfs_repair.  If you are unable to mount the filesystem, then use
> > the -L option to destroy the log and attempt a repair.
> > Note that destroying the log may cause corruption -- please attempt a mount
> > of the filesystem before doing this.
> > 
> > > 
> > > --D
> > > 
> > > > > +		_scratch_unmount
> > > > > +	fi
> > > > 
> > > > 
> > > > >  	_check_scratch_fs
> > > > >  
> > > > >  	prev=$cur
> > > > > 
> > > > 
> > > > 
> > > 
> > 
> > 
>

Patch

diff --git a/tests/generic/757 b/tests/generic/757
index 0ff5a8ac00182b..9d41975bde07bb 100755
--- a/tests/generic/757
+++ b/tests/generic/757
@@ -63,9 +63,14 @@  prev=$(_log_writes_mark_to_entry_number mkfs)
 cur=$(_log_writes_find_next_fua $prev)
 [ -z "$cur" ] && _fail "failed to locate next FUA write"
 
-while [ ! -z "$cur" ]; do
+for ((i = 0; i < 100; i++)); do
 	_log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full
 
+	# xfs_repair won't run if the log is dirty
+	if [ $FSTYP = "xfs" ]; then
+		_scratch_mount
+		_scratch_unmount
+	fi
 	_check_scratch_fs
 
 	prev=$cur