Message ID | 20240910043127.3480554-1-hch@lst.de (mailing list archive) |
---|---|
State | Not Applicable, archived |
Series | xfs: test log recovery for extent frees right after growfs |
On Tue, Sep 10, 2024 at 07:31:17AM +0300, Christoph Hellwig wrote:
> Reproduce a bug where log recovery fails when an unfinised extent free
> intent is in the same log as the growfs transaction that added the AG.

Which bug? If it's a regression test, can we have a _fixed_by_kernel_commit
to mark the known issue?

>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  tests/xfs/1323     | 61 ++++++++++++++++++++++++++++++++++++++++++++++
>  tests/xfs/1323.out | 14 +++++++++++
>  2 files changed, 75 insertions(+)
>  create mode 100755 tests/xfs/1323
>  create mode 100644 tests/xfs/1323.out
>
> diff --git a/tests/xfs/1323 b/tests/xfs/1323
> new file mode 100755
> index 000000000..a436510b0
> --- /dev/null
> +++ b/tests/xfs/1323
> @@ -0,0 +1,61 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2024, Christoph Hellwig
> +#
> +# FS QA Test No. 1323
> +#
> +# Test that recovering an extfree item residing on a freshly grown AG works.
> +#
> +. ./common/preamble
> +_begin_fstest auto quick growfs
> +
> +. ./common/filter
> +. ./common/inject
> +

 _require_scratch

> +_require_xfs_io_error_injection "free_extent"
> +
> +_xfs_force_bdev data $SCRATCH_MNT

Don't you need to do this after below _scratch_mount ?

> +
> +_cleanup()
> +{
> +        cd /
> +        _scratch_unmount > /dev/null 2>&1

SCRATCH_DEV will be unmounted at the end of each test, so this might not
be needed. If so, this whole _cleanup is not necessary.

> +        rm -rf $tmp.*
> +}
> +
> +echo "Format filesystem"
> +_scratch_mkfs_sized $((128 * 1024 * 1024)) >> $seqres.full
> +_scratch_mount >> $seqres.full
> +
> +echo "Fill file system"
> +dd if=/dev/zero of=$SCRATCH_MNT/filler1 bs=64k oflag=direct &>/dev/null
> +sync
> +dd if=/dev/zero of=$SCRATCH_MNT/filler2 bs=64k oflag=direct &>/dev/null
> +sync

There's a helper named _fill_fs() in common/populate, I'm not sure if
your above steps are necessary or can be replaced, just to confirm with
you.

> +
> +echo "Grow file system"
> +$XFS_GROWFS_PROG $SCRATCH_MNT >>$seqres.full

_require_command "$XFS_GROWFS_PROG" xfs_growfs

> +
> +echo "Create test files"
> +dd if=/dev/zero of=$SCRATCH_MNT/test1 bs=8M count=4 oflag=direct | \
> +        _filter_dd
> +dd if=/dev/zero of=$SCRATCH_MNT/test2 bs=8M count=4 oflag=direct | \
> +        _filter_dd
> +
> +echo "Inject error"
> +_scratch_inject_error "free_extent"
> +
> +echo "Remove test file"
> +rm $SCRATCH_MNT/test2

Is -f needed ?

Thanks,
Zorro

> +
> +echo "FS should be shut down, touch will fail"
> +touch $SCRATCH_MNT/test1 2>&1 | _filter_scratch
> +
> +echo "Remount to replay log"
> +_scratch_remount_dump_log >> $seqres.full
> +
> +echo "Done"
> +
> +# success, all done
> +status=0
> +exit
> diff --git a/tests/xfs/1323.out b/tests/xfs/1323.out
> new file mode 100644
> index 000000000..1740f9a1f
> --- /dev/null
> +++ b/tests/xfs/1323.out
> @@ -0,0 +1,14 @@
> +QA output created by 1323
> +Format filesystem
> +Fill file system
> +Grow file system
> +Create test files
> +4+0 records in
> +4+0 records out
> +4+0 records in
> +4+0 records out
> +Inject error
> +Remove test file
> +FS should be shut down, touch will fail
> +Remount to replay log
> +Done
> --
> 2.45.2
>
>
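For reference, a rough sketch of the _fill_fs() alternative mentioned in the review above is shown below. The helper's argument order (initial file size, then target directory) is assumed here rather than verified, and whether it produces the same on-disk layout as the two direct-I/O dd passes is exactly the open question raised in the review.

```sh
# Hypothetical replacement for the dd-based fill, not part of the posted patch.
# _fill_fs() lives in common/populate; its argument order is assumed here.
. ./common/populate

echo "Fill file system"
_fill_fs $((64 * 1024)) $SCRATCH_MNT >> $seqres.full 2>&1
sync
```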
On Tue, Sep 10, 2024 at 04:57:48PM +0800, Zorro Lang wrote:
> On Tue, Sep 10, 2024 at 07:31:17AM +0300, Christoph Hellwig wrote:
> > Reproduce a bug where log recovery fails when an unfinised extent free
> > intent is in the same log as the growfs transaction that added the AG.
>
> Which bug? If it's a regression test, can we have a _fixed_by_kernel_commit
> to mark the known issue?

I just sent the kernel patches for it.  It's been there basically
forever as far as I can tell.
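Once those kernel patches are merged, the _fixed_by_kernel_commit request above would presumably be satisfied by a line in the test along these lines; the commit id and subject below are placeholders, to be filled in when the fix actually lands.

```sh
# Placeholder values only -- the real commit id and subject are not known yet.
_fixed_by_kernel_commit xxxxxxxxxxxx \
	"xfs: fix log recovery for extent frees right after growfs"
```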
On Tue, Sep 10, 2024 at 07:31:17AM +0300, Christoph Hellwig wrote:
> Reproduce a bug where log recovery fails when an unfinised extent free
> intent is in the same log as the growfs transaction that added the AG.
>

No real issue with the test, but I wonder if we could do something more
generic. Various XFS shutdown and log recovery issues went undetected
for a while until we started adding more of the generic stress tests
currently categorized in the recoveryloop group.

So for example, I'm wondering if you took something like generic/388 or
475 and modified it to start with a smallish fs, grew it in 1GB or
whatever increments on each loop iteration, and then ran the same
generic stress/timeout/shutdown/recovery sequence, would that eventually
reproduce the issue you've fixed? I don't think reproducibility would
need to be 100% for the test to be useful, fwiw.

Note that I'm assuming we don't have something like that already. I see
growfs and shutdown tests in tests/xfs/group.list, but nothing in both
groups and I haven't looked through the individual tests. Just a
thought.

Brian

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  tests/xfs/1323     | 61 ++++++++++++++++++++++++++++++++++++++++++++++
>  tests/xfs/1323.out | 14 +++++++++++
>  2 files changed, 75 insertions(+)
>  create mode 100755 tests/xfs/1323
>  create mode 100644 tests/xfs/1323.out
>
> diff --git a/tests/xfs/1323 b/tests/xfs/1323
> new file mode 100755
> index 000000000..a436510b0
> --- /dev/null
> +++ b/tests/xfs/1323
> @@ -0,0 +1,61 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2024, Christoph Hellwig
> +#
> +# FS QA Test No. 1323
> +#
> +# Test that recovering an extfree item residing on a freshly grown AG works.
> +#
> +. ./common/preamble
> +_begin_fstest auto quick growfs
> +
> +. ./common/filter
> +. ./common/inject
> +
> +_require_xfs_io_error_injection "free_extent"
> +
> +_xfs_force_bdev data $SCRATCH_MNT
> +
> +_cleanup()
> +{
> +        cd /
> +        _scratch_unmount > /dev/null 2>&1
> +        rm -rf $tmp.*
> +}
> +
> +echo "Format filesystem"
> +_scratch_mkfs_sized $((128 * 1024 * 1024)) >> $seqres.full
> +_scratch_mount >> $seqres.full
> +
> +echo "Fill file system"
> +dd if=/dev/zero of=$SCRATCH_MNT/filler1 bs=64k oflag=direct &>/dev/null
> +sync
> +dd if=/dev/zero of=$SCRATCH_MNT/filler2 bs=64k oflag=direct &>/dev/null
> +sync
> +
> +echo "Grow file system"
> +$XFS_GROWFS_PROG $SCRATCH_MNT >>$seqres.full
> +
> +echo "Create test files"
> +dd if=/dev/zero of=$SCRATCH_MNT/test1 bs=8M count=4 oflag=direct | \
> +        _filter_dd
> +dd if=/dev/zero of=$SCRATCH_MNT/test2 bs=8M count=4 oflag=direct | \
> +        _filter_dd
> +
> +echo "Inject error"
> +_scratch_inject_error "free_extent"
> +
> +echo "Remove test file"
> +rm $SCRATCH_MNT/test2
> +
> +echo "FS should be shut down, touch will fail"
> +touch $SCRATCH_MNT/test1 2>&1 | _filter_scratch
> +
> +echo "Remount to replay log"
> +_scratch_remount_dump_log >> $seqres.full
> +
> +echo "Done"
> +
> +# success, all done
> +status=0
> +exit
> diff --git a/tests/xfs/1323.out b/tests/xfs/1323.out
> new file mode 100644
> index 000000000..1740f9a1f
> --- /dev/null
> +++ b/tests/xfs/1323.out
> @@ -0,0 +1,14 @@
> +QA output created by 1323
> +Format filesystem
> +Fill file system
> +Grow file system
> +Create test files
> +4+0 records in
> +4+0 records out
> +4+0 records in
> +4+0 records out
> +Inject error
> +Remove test file
> +FS should be shut down, touch will fail
> +Remount to replay log
> +Done
> --
> 2.45.2
>
>
On Tue, Sep 10, 2024 at 10:19:50AM -0400, Brian Foster wrote:
> No real issue with the test, but I wonder if we could do something more
> generic. Various XFS shutdown and log recovery issues went undetected
> for a while until we started adding more of the generic stress tests
> currently categorized in the recoveryloop group.
>
> So for example, I'm wondering if you took something like generic/388 or
> 475 and modified it to start with a smallish fs, grew it in 1GB or
> whatever increments on each loop iteration, and then ran the same
> generic stress/timeout/shutdown/recovery sequence, would that eventually
> reproduce the issue you've fixed? I don't think reproducibility would
> need to be 100% for the test to be useful, fwiw.
>
> Note that I'm assuming we don't have something like that already. I see
> growfs and shutdown tests in tests/xfs/group.list, but nothing in both
> groups and I haven't looked through the individual tests. Just a
> thought.

It turns out reproducing this bug was surprisingly complicated.
After a growfs we can now dip into reserves that made the test1
file start filling up the existing AGs first for a while, and thus
the error injection would hit on that and never even reach a new
AG.

So while I agree with your sentiment and like the high-level idea, I
suspect it will need a fair amount of work to actually be useful.
Right now I'm too busy with various projects to look into it,
unfortunately.
On Tue, Sep 10, 2024 at 05:10:53PM +0200, Christoph Hellwig wrote:
> On Tue, Sep 10, 2024 at 10:19:50AM -0400, Brian Foster wrote:
> > No real issue with the test, but I wonder if we could do something more
> > generic. Various XFS shutdown and log recovery issues went undetected
> > for a while until we started adding more of the generic stress tests
> > currently categorized in the recoveryloop group.
> >
> > So for example, I'm wondering if you took something like generic/388 or
> > 475 and modified it to start with a smallish fs, grew it in 1GB or
> > whatever increments on each loop iteration, and then ran the same
> > generic stress/timeout/shutdown/recovery sequence, would that eventually
> > reproduce the issue you've fixed? I don't think reproducibility would
> > need to be 100% for the test to be useful, fwiw.
> >
> > Note that I'm assuming we don't have something like that already. I see
> > growfs and shutdown tests in tests/xfs/group.list, but nothing in both
> > groups and I haven't looked through the individual tests. Just a
> > thought.
>
> It turns out reproducing this bug was surprisingly complicated.
> After a growfs we can now dip into reserves that made the test1
> file start filling up the existing AGs first for a while, and thus
> the error injection would hit on that and never even reach a new
> AG.
>
> So while agree with your sentiment and like the highlevel idea, I
> suspect it will need a fair amount of work to actually be useful.
> Right now I'm too busy with various projects to look into it
> unfortunately.
>

Fair enough, maybe I'll play with it a bit when I have some more time.

Brian
On Tue, Sep 10, 2024 at 12:13:29PM -0400, Brian Foster wrote:
> On Tue, Sep 10, 2024 at 05:10:53PM +0200, Christoph Hellwig wrote:
> > On Tue, Sep 10, 2024 at 10:19:50AM -0400, Brian Foster wrote:
> > > No real issue with the test, but I wonder if we could do something more
> > > generic. Various XFS shutdown and log recovery issues went undetected
> > > for a while until we started adding more of the generic stress tests
> > > currently categorized in the recoveryloop group.
> > >
> > > So for example, I'm wondering if you took something like generic/388 or
> > > 475 and modified it to start with a smallish fs, grew it in 1GB or
> > > whatever increments on each loop iteration, and then ran the same
> > > generic stress/timeout/shutdown/recovery sequence, would that eventually
> > > reproduce the issue you've fixed? I don't think reproducibility would
> > > need to be 100% for the test to be useful, fwiw.
> > >
> > > Note that I'm assuming we don't have something like that already. I see
> > > growfs and shutdown tests in tests/xfs/group.list, but nothing in both
> > > groups and I haven't looked through the individual tests. Just a
> > > thought.
> >
> > It turns out reproducing this bug was surprisingly complicated.
> > After a growfs we can now dip into reserves that made the test1
> > file start filling up the existing AGs first for a while, and thus
> > the error injection would hit on that and never even reach a new
> > AG.
> >
> > So while agree with your sentiment and like the highlevel idea, I
> > suspect it will need a fair amount of work to actually be useful.
> > Right now I'm too busy with various projects to look into it
> > unfortunately.
> >
>
> Fair enough, maybe I'll play with it a bit when I have some more time.
>
> Brian
>
>

FWIW, here's a quick hack at such a test. This is essentially a copy of
xfs/104, tweaked to remove some of the output noise and whatnot, and
hacked in some bits from generic/388 to do a shutdown and mount cycle
per iteration.

I'm not sure if this reproduces your original problem, but this blows up
pretty quickly on 6.12.0-rc2. I see a stream of warnings that start like
this (buffer readahead path via log recovery):

[ 2807.764283] XFS (vdb2): xfs_buf_map_verify: daddr 0x3e803 out of range, EOFS 0x3e800
[ 2807.768094] ------------[ cut here ]------------
[ 2807.770629] WARNING: CPU: 0 PID: 28386 at fs/xfs/xfs_buf.c:553 xfs_buf_get_map+0x184e/0x2670 [xfs]

... and then end up with an unrecoverable/unmountable fs. From the title
it sounds like this may be a different issue though.. hm?

Brian

--- 8< ---

diff --git a/tests/xfs/609 b/tests/xfs/609
new file mode 100755
index 00000000..b9c23869
--- /dev/null
+++ b/tests/xfs/609
@@ -0,0 +1,100 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2000-2004 Silicon Graphics, Inc.  All Rights Reserved.
+#
+# FS QA Test No. 609
+#
+# XFS online growfs-while-allocating tests (data subvol variant)
+#
+. ./common/preamble
+_begin_fstest growfs ioctl prealloc auto stress
+
+# Import common functions.
+. ./common/filter
+
+_create_scratch()
+{
+        _scratch_mkfs_xfs $@ >> $seqres.full
+
+        if ! _try_scratch_mount 2>/dev/null
+        then
+                echo "failed to mount $SCRATCH_DEV"
+                exit 1
+        fi
+
+        # fix the reserve block pool to a known size so that the enospc
+        # calculations work out correctly.
+        _scratch_resvblks 1024 > /dev/null 2>&1
+}
+
+_fill_scratch()
+{
+        $XFS_IO_PROG -f -c "resvsp 0 ${1}" $SCRATCH_MNT/resvfile
+}
+
+_stress_scratch()
+{
+        procs=3
+        nops=1000
+        # -w ensures that the only ops are ones which cause write I/O
+        FSSTRESS_ARGS=`_scale_fsstress_args -d $SCRATCH_MNT -w -p $procs \
+            -n $nops $FSSTRESS_AVOID`
+        $FSSTRESS_PROG $FSSTRESS_ARGS >> $seqres.full 2>&1 &
+}
+
+_require_scratch
+_require_xfs_io_command "falloc"
+
+_scratch_mkfs_xfs | tee -a $seqres.full | _filter_mkfs 2>$tmp.mkfs
+. $tmp.mkfs	# extract blocksize and data size for scratch device
+
+endsize=`expr 550 \* 1048576`	# stop after growing this big
+incsize=`expr 42 \* 1048576`	# grow in chunks of this size
+modsize=`expr 4 \* $incsize`	# pause after this many increments
+
+[ `expr $endsize / $dbsize` -lt $dblocks ] || _notrun "Scratch device too small"
+
+nags=4
+size=`expr 125 \* 1048576`	# 120 megabytes initially
+sizeb=`expr $size / $dbsize`	# in data blocks
+logblks=$(_scratch_find_xfs_min_logblocks -dsize=${size} -dagcount=${nags})
+_create_scratch -lsize=${logblks}b -dsize=${size} -dagcount=${nags}
+
+for i in `seq 125 -1 90`; do
+        fillsize=`expr $i \* 1048576`
+        out="$(_fill_scratch $fillsize 2>&1)"
+        echo "$out" | grep -q 'No space left on device' && continue
+        test -n "${out}" && echo "$out"
+        break
+done
+
+#
+# Grow the filesystem while actively stressing it...
+# Kick off more stress threads on each iteration, grow; repeat.
+#
+while [ $size -le $endsize ]; do
+        echo "*** stressing a ${sizeb} block filesystem" >> $seqres.full
+        _stress_scratch
+        size=`expr $size + $incsize`
+        sizeb=`expr $size / $dbsize`	# in data blocks
+        echo "*** growing to a ${sizeb} block filesystem" >> $seqres.full
+        xfs_growfs -D ${sizeb} $SCRATCH_MNT >> $seqres.full
+        echo AGCOUNT=$agcount >> $seqres.full
+        echo >> $seqres.full
+
+        sleep $((RANDOM % 3))
+        _scratch_shutdown
+        ps -e | grep fsstress > /dev/null 2>&1
+        while [ $? -eq 0 ]; do
+                killall -9 fsstress > /dev/null 2>&1
+                wait > /dev/null 2>&1
+                ps -e | grep fsstress > /dev/null 2>&1
+        done
+        _scratch_cycle_mount || _fail "cycle mount failed"
+done > /dev/null 2>&1
+wait	# stop for any remaining stress processes
+
+_scratch_unmount
+
+status=0
+exit
diff --git a/tests/xfs/609.out b/tests/xfs/609.out
new file mode 100644
index 00000000..1853cc65
--- /dev/null
+++ b/tests/xfs/609.out
@@ -0,0 +1,7 @@
+QA output created by 609
+meta-data=DDEV isize=XXX agcount=N, agsize=XXX blks
+data     = bsize=XXX blocks=XXX, imaxpct=PCT
+         = sunit=XXX swidth=XXX, unwritten=X
+naming   =VERN bsize=XXX
+log      =LDEV bsize=XXX blocks=XXX
+realtime =RDEV extsz=XXX blocks=XXX, rtextents=XXX
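If a polished version of this sketch were ever merged, it would presumably also need a matching entry in tests/xfs/group.list that mirrors the _begin_fstest groups, roughly along these lines.

```sh
# Hypothetical registration step; the final test number and group set may differ.
echo "609 growfs ioctl prealloc auto stress" >> tests/xfs/group.list
```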
On Tue, Oct 08, 2024 at 12:28:37PM -0400, Brian Foster wrote:
> FWIW, here's a quick hack at such a test. This is essentially a copy of
> xfs/104, tweaked to remove some of the output noise and whatnot, and
> hacked in some bits from generic/388 to do a shutdown and mount cycle
> per iteration.
>
> I'm not sure if this reproduces your original problem, but this blows up
> pretty quickly on 6.12.0-rc2. I see a stream of warnings that start like
> this (buffer readahead path via log recovery):
>
> [ 2807.764283] XFS (vdb2): xfs_buf_map_verify: daddr 0x3e803 out of range, EOFS 0x3e800
> [ 2807.768094] ------------[ cut here ]------------
> [ 2807.770629] WARNING: CPU: 0 PID: 28386 at fs/xfs/xfs_buf.c:553 xfs_buf_get_map+0x184e/0x2670 [xfs]
>
> ... and then end up with an unrecoverable/unmountable fs. From the title
> it sounds like this may be a different issue though.. hm?

That's at least the same initial message I hit.
On Wed, Oct 09, 2024 at 10:04:51AM +0200, Christoph Hellwig wrote:
> On Tue, Oct 08, 2024 at 12:28:37PM -0400, Brian Foster wrote:
> > FWIW, here's a quick hack at such a test. This is essentially a copy of
> > xfs/104, tweaked to remove some of the output noise and whatnot, and
> > hacked in some bits from generic/388 to do a shutdown and mount cycle
> > per iteration.
> >
> > I'm not sure if this reproduces your original problem, but this blows up
> > pretty quickly on 6.12.0-rc2. I see a stream of warnings that start like
> > this (buffer readahead path via log recovery):
> >
> > [ 2807.764283] XFS (vdb2): xfs_buf_map_verify: daddr 0x3e803 out of range, EOFS 0x3e800
> > [ 2807.768094] ------------[ cut here ]------------
> > [ 2807.770629] WARNING: CPU: 0 PID: 28386 at fs/xfs/xfs_buf.c:553 xfs_buf_get_map+0x184e/0x2670 [xfs]
> >
> > ... and then end up with an unrecoverable/unmountable fs. From the title
> > it sounds like this may be a different issue though.. hm?
>
> That's at least the same initial message I hit.
>

Ok, so then what happened? :) Are there outstanding patches somewhere to
fix this problem? If so, I can give it a test with this.

I'm also trying to figure out if the stress level of this particular
test should be turned up a notch or three, but I can't really dig into
that until this initial variant is passing reliably.

Brian
On Wed, Oct 09, 2024 at 08:35:46AM -0400, Brian Foster wrote:
> Ok, so then what happened? :) Are there outstanding patches somewhere to
> fix this problem? If so, I can give it a test with this.

Yes, "fix recovery of allocator ops after a growfs" from Sep 30.
On Wed, Oct 09, 2024 at 02:43:16PM +0200, Christoph Hellwig wrote:
> On Wed, Oct 09, 2024 at 08:35:46AM -0400, Brian Foster wrote:
> > Ok, so then what happened? :) Are there outstanding patches somewhere to
> > fix this problem? If so, I can give it a test with this.
>
> Yes, "fix recovery of allocator ops after a growfs" from Sep 30.
>

Thanks. This seems to fix the unmountable fs problem, so I'd guess it's
reproducing something related.

The test still fails occasionally with a trans abort and I see some
bnobt/cntbt corruption messages like the one appended below, but I'll
leave it to you to decide whether this is a regression or preexisting
problem.

I probably won't get through it today, but I'll try to take a closer
look at the patches soon..

Brian

...

XFS (vdb2): cntbt record corruption in AG 8 detected at xfs_alloc_check_irec+0xfa/0x160 [xfs]!
XFS (vdb2): start block 0xa block count 0x1f36
XFS (vdb2): Internal error xfs_trans_cancel at line 872 of file fs/xfs/xfs_trans.c.  Caller xfs_symlink+0x5a6/0xbd0 [xfs]
CPU: 5 UID: 0 PID: 8625 Comm: fsstress Tainted: G            E      6.12.0-rc2+ #251
Tainted: [E]=UNSIGNED_MODULE
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x8d/0xb0
 xfs_trans_cancel+0x3ca/0x530 [xfs]
 xfs_symlink+0x5a6/0xbd0 [xfs]
 ? __pfx_xfs_symlink+0x10/0x10 [xfs]
 ? avc_has_perm+0x77/0x110
 ? lock_is_held_type+0xcd/0x120
 ? __pfx_avc_has_perm+0x10/0x10
 ? avc_has_perm_noaudit+0x3a/0x280
 ? may_create+0x26a/0x2e0
 xfs_vn_symlink+0x144/0x390 [xfs]
 ? __pfx_selinux_inode_permission+0x10/0x10
 ? __pfx_xfs_vn_symlink+0x10/0x10 [xfs]
 vfs_symlink+0x33e/0x580
 do_symlinkat+0x1cf/0x250
 ? __pfx_do_symlinkat+0x10/0x10
 ? getname_flags.part.0+0xae/0x490
 __x64_sys_symlink+0x71/0x90
 do_syscall_64+0x93/0x180
 ? do_syscall_64+0x9f/0x180
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7fcb692378eb
Code: 8b 0d 49 f5 0c 00 f7 d8 64 89 01 b9 ff ff ff ff eb d3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa b8 58 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 11 f5 0c 00 f7 d8
RSP: 002b:00007ffc547e52e8 EFLAGS: 00000206 ORIG_RAX: 0000000000000058
RAX: ffffffffffffffda RBX: 000000003804a200 RCX: 00007fcb692378eb
RDX: 0000000000000000 RSI: 0000000038049200 RDI: 000000003804a200
RBP: 0000000038049200 R08: 000000003804a440 R09: 00007fcb69307b20
R10: 0000000000000270 R11: 0000000000000206 R12: 000000003804a200
R13: 00007ffc547e5450 R14: 0000000078ba5238 R15: 00007fcb6912c6c8
On Wed, Oct 09, 2024 at 11:14:49AM -0400, Brian Foster wrote:
> Thanks. This seems to fix the unmountable fs problem, so I'd guess it's
> reproducing something related.

Heh.

>
> The test still fails occasionally with a trans abort and I see some
> bnobt/cntbt corruption messages like the one appended below, but I'll
> leave to you to decide whether this is a regression or preexisting
> problem.
>
> I probably won't get through it today, but I'll try to take a closer
> look at the patches soon..

My bet is on pre-existing, but either way we should use the chance to
fix this properly.  I'm a little busy right now, but I'll try to get
back to this soon and play with your test.
On Wed, Oct 09, 2024 at 11:14:49AM -0400, Brian Foster wrote:
> Thanks. This seems to fix the unmountable fs problem, so I'd guess it's
> reproducing something related.
>
> The test still fails occasionally with a trans abort and I see some
> bnobt/cntbt corruption messages like the one appended below, but I'll
> leave to you to decide whether this is a regression or preexisting
> problem.

That's because log recovery completely fails to update the in-core
state for the last existing AG. I've added a fix for that.
diff --git a/tests/xfs/1323 b/tests/xfs/1323
new file mode 100755
index 000000000..a436510b0
--- /dev/null
+++ b/tests/xfs/1323
@@ -0,0 +1,61 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2024, Christoph Hellwig
+#
+# FS QA Test No. 1323
+#
+# Test that recovering an extfree item residing on a freshly grown AG works.
+#
+. ./common/preamble
+_begin_fstest auto quick growfs
+
+. ./common/filter
+. ./common/inject
+
+_require_xfs_io_error_injection "free_extent"
+
+_xfs_force_bdev data $SCRATCH_MNT
+
+_cleanup()
+{
+        cd /
+        _scratch_unmount > /dev/null 2>&1
+        rm -rf $tmp.*
+}
+
+echo "Format filesystem"
+_scratch_mkfs_sized $((128 * 1024 * 1024)) >> $seqres.full
+_scratch_mount >> $seqres.full
+
+echo "Fill file system"
+dd if=/dev/zero of=$SCRATCH_MNT/filler1 bs=64k oflag=direct &>/dev/null
+sync
+dd if=/dev/zero of=$SCRATCH_MNT/filler2 bs=64k oflag=direct &>/dev/null
+sync
+
+echo "Grow file system"
+$XFS_GROWFS_PROG $SCRATCH_MNT >>$seqres.full
+
+echo "Create test files"
+dd if=/dev/zero of=$SCRATCH_MNT/test1 bs=8M count=4 oflag=direct | \
+        _filter_dd
+dd if=/dev/zero of=$SCRATCH_MNT/test2 bs=8M count=4 oflag=direct | \
+        _filter_dd
+
+echo "Inject error"
+_scratch_inject_error "free_extent"
+
+echo "Remove test file"
+rm $SCRATCH_MNT/test2
+
+echo "FS should be shut down, touch will fail"
+touch $SCRATCH_MNT/test1 2>&1 | _filter_scratch
+
+echo "Remount to replay log"
+_scratch_remount_dump_log >> $seqres.full
+
+echo "Done"
+
+# success, all done
+status=0
+exit
diff --git a/tests/xfs/1323.out b/tests/xfs/1323.out
new file mode 100644
index 000000000..1740f9a1f
--- /dev/null
+++ b/tests/xfs/1323.out
@@ -0,0 +1,14 @@
+QA output created by 1323
+Format filesystem
+Fill file system
+Grow file system
+Create test files
+4+0 records in
+4+0 records out
+4+0 records in
+4+0 records out
+Inject error
+Remove test file
+FS should be shut down, touch will fail
+Remount to replay log
+Done
Reproduce a bug where log recovery fails when an unfinished extent free
intent is in the same log as the growfs transaction that added the AG.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 tests/xfs/1323     | 61 ++++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/1323.out | 14 +++++++++++
 2 files changed, 75 insertions(+)
 create mode 100755 tests/xfs/1323
 create mode 100644 tests/xfs/1323.out
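For anyone wanting to try the new test locally, the usual fstests invocation applies; the device paths below are placeholders for a local setup, not values taken from this thread.

```sh
# Example only -- substitute real test/scratch devices and mount points.
cd xfstests-dev
export TEST_DEV=/dev/sdb1 TEST_DIR=/mnt/test
export SCRATCH_DEV=/dev/sdc1 SCRATCH_MNT=/mnt/scratch
./check xfs/1323
```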