Message ID | 20231112233325.103250-1-wqu@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | fstests: btrfs: test snapshot creation with existing qgroup | expand |
On Sun, Nov 12, 2023 at 11:33 PM Qu Wenruo <wqu@suse.com> wrote: > > [BUG] > There is a sysbot regression report about transaction abort during > snapshot creation, which is caused by the new timing of qgroup creation > and too strict error check. > > [FIX] > The proper fix is already submitted, with the title "btrfs: do not abort > transaction if there is already an existing qgroup". > > [TEST] > The new test case would reproduce the regression by: > > - Create a subvolume and a snapshot of it > > - Record the subvolumeid of the snapshot > > - Re-create the fs > Since btrfs won't reuse the subvolume id, we have to re-create the fs. > > - Enable quota and create a qgroup with the same subvolumeid > > - Create a subvolume and a snapshot of it > For unpatched and affected kernel (thankfully no release is affected), > the snapshot creation would fail due to aborted transaction. > > - Make sure the subvolume id doesn't change for the snapshot > There is one very hacky attempt to fix it by avoiding using the > subvolume id, which is completely wrong and would be caught by this > extra check. > > Signed-off-by: Qu Wenruo <wqu@suse.com> > --- > tests/btrfs/303 | 80 +++++++++++++++++++++++++++++++++++++++++++++ > tests/btrfs/303.out | 2 ++ > 2 files changed, 82 insertions(+) > create mode 100755 tests/btrfs/303 > create mode 100644 tests/btrfs/303.out > > diff --git a/tests/btrfs/303 b/tests/btrfs/303 > new file mode 100755 > index 00000000..fe924496 > --- /dev/null > +++ b/tests/btrfs/303 > @@ -0,0 +1,80 @@ > +#! /bin/bash > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright (C) 2023 SUSE Linux Products GmbH. All Rights Reserved. > +# > +# FS QA Test 303 > +# > +# A regression test to make sure snapshot creation won't cause transaction > +# abort if there is already an existing qgroup. > +# > +. ./common/preamble > +_begin_fstest auto quick qgroup Also 'snapshot' and 'subvol' groups. > + > +. ./common/filter > + > +_supported_fs btrfs > +_require_scratch > + > +_fixed_by_kernel_commit xxxxxxxxxxxx \ > + "btrfs: do not abort transaction if there is already an existing qgroup" > + > +_scratch_mkfs >> $seqres.full 2>&1 || _fail "mkfs failed" > +_scratch_mount > + > +# Create the first subvolume and get its id. > +# This subvolume id should not change no matter if there is an existing > +# qgroup for it. > +$BTRFS_UTIL_PROG subvolume create "$SCRATCH_MNT/subvol" >> $seqres.full > +$BTRFS_UTIL_PROG subvolume snapshot "$SCRATCH_MNT/subvol" \ > + "$SCRATCH_MNT/snapshot">> $seqres.full > + > +init_subvolid=$(_btrfs_get_subvolid "$SCRATCH_MNT" "snapshot") > + > +if [ -z "$init_subvolid" ]; then > + _fail "Unable to get the subvolid of the first snapshot" > +fi > + > +echo "Subvolumeid: ${init_subvolid}" >> $seqres.full > + > +_scratch_unmount > + > +# Re-create the fs, as btrfs won't reuse the subvolume id. > +_scratch_mkfs >> $seqres.full 2>&1 || _fail "2nd mkfs failed" > +_scratch_mount > + > +$BTRFS_UTIL_PROG quota enable "$SCRATCH_MNT" >> $seqres.full > +$BTRFS_UTIL_PROG quota rescan -w "$SCRATCH_MNT" >> $seqres.full > + > +# Create a qgroup for the first subvolume, this would make the later > +# subvolume creation to find an existing qgroup, and abort transaction. > +$BTRFS_UTIL_PROG qgroup create 0/"$init_subvolid" "$SCRATCH_MNT" >> $seqres.full > +sync This sync is not needed. An unpatched kernel still fails, and a patched kernel passes this test without the sync. Also, please always comment on why a sync is needed. In this case it can be removed because it's redundant. > + > +# Now create the first snapshot, which should have the same subvolid no matter > +# if the quota is enabled. > +$BTRFS_UTIL_PROG subvolume create "$SCRATCH_MNT/subvol" >> $seqres.full > +$BTRFS_UTIL_PROG subvolume snapshot "$SCRATCH_MNT/subvol" \ > + "$SCRATCH_MNT/snapshot">> $seqres.full > + > +# Either the snapshot create failed and transaction is aborted thus no > +# snapshot here, or we should be able to create the snapshot. > +new_subvolid=$(_btrfs_get_subvolid "$SCRATCH_MNT" "snapshot") > + > +echo "Subvolumeid: ${new_subvolid}" >> $seqres.full > + > +if [ -z "$new_subvolid" ]; then > + _fail "Unable to get the subvolid of the first snapshot" > +fi > + > +# Make sure the subvolumeid for the first snapshot didn't change. > +if [ "$new_subvolid" -ne "$init_subvolid" ]; then > + _fail "Subvolumeid for the first snapshot changed, has ${new_subvolid} expect ${init_subvolid}" > +fi > + > +_scratch_unmount This explicit unmount is not needed, the fstests framework automatically does that. Otherwise it looks fine, thanks. > + > +echo "Silence is golden" > + > +# success, all done > +status=0 > +exit > diff --git a/tests/btrfs/303.out b/tests/btrfs/303.out > new file mode 100644 > index 00000000..d48808e6 > --- /dev/null > +++ b/tests/btrfs/303.out > @@ -0,0 +1,2 @@ > +QA output created by 303 > +Silence is golden > -- > 2.42.0 > >
On 2023/11/14 00:03, Filipe Manana wrote: > On Sun, Nov 12, 2023 at 11:33 PM Qu Wenruo <wqu@suse.com> wrote: >> >> [BUG] >> There is a sysbot regression report about transaction abort during >> snapshot creation, which is caused by the new timing of qgroup creation >> and too strict error check. >> >> [FIX] >> The proper fix is already submitted, with the title "btrfs: do not abort >> transaction if there is already an existing qgroup". >> >> [TEST] >> The new test case would reproduce the regression by: >> >> - Create a subvolume and a snapshot of it >> >> - Record the subvolumeid of the snapshot >> >> - Re-create the fs >> Since btrfs won't reuse the subvolume id, we have to re-create the fs. >> >> - Enable quota and create a qgroup with the same subvolumeid >> >> - Create a subvolume and a snapshot of it >> For unpatched and affected kernel (thankfully no release is affected), >> the snapshot creation would fail due to aborted transaction. >> >> - Make sure the subvolume id doesn't change for the snapshot >> There is one very hacky attempt to fix it by avoiding using the >> subvolume id, which is completely wrong and would be caught by this >> extra check. >> >> Signed-off-by: Qu Wenruo <wqu@suse.com> >> --- >> tests/btrfs/303 | 80 +++++++++++++++++++++++++++++++++++++++++++++ >> tests/btrfs/303.out | 2 ++ >> 2 files changed, 82 insertions(+) >> create mode 100755 tests/btrfs/303 >> create mode 100644 tests/btrfs/303.out >> >> diff --git a/tests/btrfs/303 b/tests/btrfs/303 >> new file mode 100755 >> index 00000000..fe924496 >> --- /dev/null >> +++ b/tests/btrfs/303 >> @@ -0,0 +1,80 @@ >> +#! /bin/bash >> +# SPDX-License-Identifier: GPL-2.0 >> +# Copyright (C) 2023 SUSE Linux Products GmbH. All Rights Reserved. >> +# >> +# FS QA Test 303 >> +# >> +# A regression test to make sure snapshot creation won't cause transaction >> +# abort if there is already an existing qgroup. >> +# >> +. ./common/preamble >> +_begin_fstest auto quick qgroup > > Also 'snapshot' and 'subvol' groups. > >> + >> +. ./common/filter >> + >> +_supported_fs btrfs >> +_require_scratch >> + >> +_fixed_by_kernel_commit xxxxxxxxxxxx \ >> + "btrfs: do not abort transaction if there is already an existing qgroup" >> + >> +_scratch_mkfs >> $seqres.full 2>&1 || _fail "mkfs failed" >> +_scratch_mount >> + >> +# Create the first subvolume and get its id. >> +# This subvolume id should not change no matter if there is an existing >> +# qgroup for it. >> +$BTRFS_UTIL_PROG subvolume create "$SCRATCH_MNT/subvol" >> $seqres.full >> +$BTRFS_UTIL_PROG subvolume snapshot "$SCRATCH_MNT/subvol" \ >> + "$SCRATCH_MNT/snapshot">> $seqres.full >> + >> +init_subvolid=$(_btrfs_get_subvolid "$SCRATCH_MNT" "snapshot") >> + >> +if [ -z "$init_subvolid" ]; then >> + _fail "Unable to get the subvolid of the first snapshot" >> +fi >> + >> +echo "Subvolumeid: ${init_subvolid}" >> $seqres.full >> + >> +_scratch_unmount >> + >> +# Re-create the fs, as btrfs won't reuse the subvolume id. >> +_scratch_mkfs >> $seqres.full 2>&1 || _fail "2nd mkfs failed" >> +_scratch_mount >> + >> +$BTRFS_UTIL_PROG quota enable "$SCRATCH_MNT" >> $seqres.full >> +$BTRFS_UTIL_PROG quota rescan -w "$SCRATCH_MNT" >> $seqres.full >> + >> +# Create a qgroup for the first subvolume, this would make the later >> +# subvolume creation to find an existing qgroup, and abort transaction. >> +$BTRFS_UTIL_PROG qgroup create 0/"$init_subvolid" "$SCRATCH_MNT" >> $seqres.full >> +sync > > This sync is not needed. An unpatched kernel still fails, and a > patched kernel passes this test without the sync. > > Also, please always comment on why a sync is needed. > In this case it can be removed because it's redundant. Would address all comments, but here I'm a little curious about the "sync" comment principle. I totally understand for this particular case the "sync" is unnecessary, the qgroup item doesn't need to be committed, thus it's totally reasonable to remove this sync. But I'm wondering is there any other reasons why we should avoid unnecessary "sync"s? Like slowing down the test or just to improve our awareness and avoid sync-happy guys? Thanks, Qu > >> + >> +# Now create the first snapshot, which should have the same subvolid no matter >> +# if the quota is enabled. >> +$BTRFS_UTIL_PROG subvolume create "$SCRATCH_MNT/subvol" >> $seqres.full >> +$BTRFS_UTIL_PROG subvolume snapshot "$SCRATCH_MNT/subvol" \ >> + "$SCRATCH_MNT/snapshot">> $seqres.full >> + >> +# Either the snapshot create failed and transaction is aborted thus no >> +# snapshot here, or we should be able to create the snapshot. >> +new_subvolid=$(_btrfs_get_subvolid "$SCRATCH_MNT" "snapshot") >> + >> +echo "Subvolumeid: ${new_subvolid}" >> $seqres.full >> + >> +if [ -z "$new_subvolid" ]; then >> + _fail "Unable to get the subvolid of the first snapshot" >> +fi >> + >> +# Make sure the subvolumeid for the first snapshot didn't change. >> +if [ "$new_subvolid" -ne "$init_subvolid" ]; then >> + _fail "Subvolumeid for the first snapshot changed, has ${new_subvolid} expect ${init_subvolid}" >> +fi >> + >> +_scratch_unmount > > This explicit unmount is not needed, the fstests framework > automatically does that. > > Otherwise it looks fine, thanks. > >> + >> +echo "Silence is golden" >> + >> +# success, all done >> +status=0 >> +exit >> diff --git a/tests/btrfs/303.out b/tests/btrfs/303.out >> new file mode 100644 >> index 00000000..d48808e6 >> --- /dev/null >> +++ b/tests/btrfs/303.out >> @@ -0,0 +1,2 @@ >> +QA output created by 303 >> +Silence is golden >> -- >> 2.42.0 >> >> >
On Tue, Nov 14, 2023 at 2:56 AM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: > > > > On 2023/11/14 00:03, Filipe Manana wrote: > > On Sun, Nov 12, 2023 at 11:33 PM Qu Wenruo <wqu@suse.com> wrote: > >> > >> [BUG] > >> There is a sysbot regression report about transaction abort during > >> snapshot creation, which is caused by the new timing of qgroup creation > >> and too strict error check. > >> > >> [FIX] > >> The proper fix is already submitted, with the title "btrfs: do not abort > >> transaction if there is already an existing qgroup". > >> > >> [TEST] > >> The new test case would reproduce the regression by: > >> > >> - Create a subvolume and a snapshot of it > >> > >> - Record the subvolumeid of the snapshot > >> > >> - Re-create the fs > >> Since btrfs won't reuse the subvolume id, we have to re-create the fs. > >> > >> - Enable quota and create a qgroup with the same subvolumeid > >> > >> - Create a subvolume and a snapshot of it > >> For unpatched and affected kernel (thankfully no release is affected), > >> the snapshot creation would fail due to aborted transaction. > >> > >> - Make sure the subvolume id doesn't change for the snapshot > >> There is one very hacky attempt to fix it by avoiding using the > >> subvolume id, which is completely wrong and would be caught by this > >> extra check. > >> > >> Signed-off-by: Qu Wenruo <wqu@suse.com> > >> --- > >> tests/btrfs/303 | 80 +++++++++++++++++++++++++++++++++++++++++++++ > >> tests/btrfs/303.out | 2 ++ > >> 2 files changed, 82 insertions(+) > >> create mode 100755 tests/btrfs/303 > >> create mode 100644 tests/btrfs/303.out > >> > >> diff --git a/tests/btrfs/303 b/tests/btrfs/303 > >> new file mode 100755 > >> index 00000000..fe924496 > >> --- /dev/null > >> +++ b/tests/btrfs/303 > >> @@ -0,0 +1,80 @@ > >> +#! /bin/bash > >> +# SPDX-License-Identifier: GPL-2.0 > >> +# Copyright (C) 2023 SUSE Linux Products GmbH. All Rights Reserved. > >> +# > >> +# FS QA Test 303 > >> +# > >> +# A regression test to make sure snapshot creation won't cause transaction > >> +# abort if there is already an existing qgroup. > >> +# > >> +. ./common/preamble > >> +_begin_fstest auto quick qgroup > > > > Also 'snapshot' and 'subvol' groups. > > > >> + > >> +. ./common/filter > >> + > >> +_supported_fs btrfs > >> +_require_scratch > >> + > >> +_fixed_by_kernel_commit xxxxxxxxxxxx \ > >> + "btrfs: do not abort transaction if there is already an existing qgroup" > >> + > >> +_scratch_mkfs >> $seqres.full 2>&1 || _fail "mkfs failed" > >> +_scratch_mount > >> + > >> +# Create the first subvolume and get its id. > >> +# This subvolume id should not change no matter if there is an existing > >> +# qgroup for it. > >> +$BTRFS_UTIL_PROG subvolume create "$SCRATCH_MNT/subvol" >> $seqres.full > >> +$BTRFS_UTIL_PROG subvolume snapshot "$SCRATCH_MNT/subvol" \ > >> + "$SCRATCH_MNT/snapshot">> $seqres.full > >> + > >> +init_subvolid=$(_btrfs_get_subvolid "$SCRATCH_MNT" "snapshot") > >> + > >> +if [ -z "$init_subvolid" ]; then > >> + _fail "Unable to get the subvolid of the first snapshot" > >> +fi > >> + > >> +echo "Subvolumeid: ${init_subvolid}" >> $seqres.full > >> + > >> +_scratch_unmount > >> + > >> +# Re-create the fs, as btrfs won't reuse the subvolume id. > >> +_scratch_mkfs >> $seqres.full 2>&1 || _fail "2nd mkfs failed" > >> +_scratch_mount > >> + > >> +$BTRFS_UTIL_PROG quota enable "$SCRATCH_MNT" >> $seqres.full > >> +$BTRFS_UTIL_PROG quota rescan -w "$SCRATCH_MNT" >> $seqres.full > >> + > >> +# Create a qgroup for the first subvolume, this would make the later > >> +# subvolume creation to find an existing qgroup, and abort transaction. > >> +$BTRFS_UTIL_PROG qgroup create 0/"$init_subvolid" "$SCRATCH_MNT" >> $seqres.full > >> +sync > > > > This sync is not needed. An unpatched kernel still fails, and a > > patched kernel passes this test without the sync. > > > > Also, please always comment on why a sync is needed. > > In this case it can be removed because it's redundant. > > Would address all comments, but here I'm a little curious about the > "sync" comment principle. > > I totally understand for this particular case the "sync" is unnecessary, > the qgroup item doesn't need to be committed, thus it's totally > reasonable to remove this sync. > > But I'm wondering is there any other reasons why we should avoid > unnecessary "sync"s? It's a principle of clarity and simplicity. Adding a comment like "Make the qgroup item persisted in the qgroup btree, commit the current transaction", makes it clear why the sync is needed, what it accomplishes. It makes the test easier to read for other people, and easier to maintain. In this case it's a simplicity principle also because if it's not needed, it's better to not be there, otherwise it just makes it confusing and longer than necessary. > Like slowing down the test or just to improve our awareness and avoid > sync-happy guys? My comment is not about performance. I run tests in a VM dedicated to that, so adding a sync will at most increase run time by a few milliseconds - that's insignificant and therefore I don't care. All I care about is that the test is clear and has no unnecessary steps - be it a sync or something else. Thanks. > > Thanks, > Qu > > > >> + > >> +# Now create the first snapshot, which should have the same subvolid no matter > >> +# if the quota is enabled. > >> +$BTRFS_UTIL_PROG subvolume create "$SCRATCH_MNT/subvol" >> $seqres.full > >> +$BTRFS_UTIL_PROG subvolume snapshot "$SCRATCH_MNT/subvol" \ > >> + "$SCRATCH_MNT/snapshot">> $seqres.full > >> + > >> +# Either the snapshot create failed and transaction is aborted thus no > >> +# snapshot here, or we should be able to create the snapshot. > >> +new_subvolid=$(_btrfs_get_subvolid "$SCRATCH_MNT" "snapshot") > >> + > >> +echo "Subvolumeid: ${new_subvolid}" >> $seqres.full > >> + > >> +if [ -z "$new_subvolid" ]; then > >> + _fail "Unable to get the subvolid of the first snapshot" > >> +fi > >> + > >> +# Make sure the subvolumeid for the first snapshot didn't change. > >> +if [ "$new_subvolid" -ne "$init_subvolid" ]; then > >> + _fail "Subvolumeid for the first snapshot changed, has ${new_subvolid} expect ${init_subvolid}" > >> +fi > >> + > >> +_scratch_unmount > > > > This explicit unmount is not needed, the fstests framework > > automatically does that. > > > > Otherwise it looks fine, thanks. > > > >> + > >> +echo "Silence is golden" > >> + > >> +# success, all done > >> +status=0 > >> +exit > >> diff --git a/tests/btrfs/303.out b/tests/btrfs/303.out > >> new file mode 100644 > >> index 00000000..d48808e6 > >> --- /dev/null > >> +++ b/tests/btrfs/303.out > >> @@ -0,0 +1,2 @@ > >> +QA output created by 303 > >> +Silence is golden > >> -- > >> 2.42.0 > >> > >> > >
diff --git a/tests/btrfs/303 b/tests/btrfs/303 new file mode 100755 index 00000000..fe924496 --- /dev/null +++ b/tests/btrfs/303 @@ -0,0 +1,80 @@ +#! /bin/bash +# SPDX-License-Identifier: GPL-2.0 +# Copyright (C) 2023 SUSE Linux Products GmbH. All Rights Reserved. +# +# FS QA Test 303 +# +# A regression test to make sure snapshot creation won't cause transaction +# abort if there is already an existing qgroup. +# +. ./common/preamble +_begin_fstest auto quick qgroup + +. ./common/filter + +_supported_fs btrfs +_require_scratch + +_fixed_by_kernel_commit xxxxxxxxxxxx \ + "btrfs: do not abort transaction if there is already an existing qgroup" + +_scratch_mkfs >> $seqres.full 2>&1 || _fail "mkfs failed" +_scratch_mount + +# Create the first subvolume and get its id. +# This subvolume id should not change no matter if there is an existing +# qgroup for it. +$BTRFS_UTIL_PROG subvolume create "$SCRATCH_MNT/subvol" >> $seqres.full +$BTRFS_UTIL_PROG subvolume snapshot "$SCRATCH_MNT/subvol" \ + "$SCRATCH_MNT/snapshot">> $seqres.full + +init_subvolid=$(_btrfs_get_subvolid "$SCRATCH_MNT" "snapshot") + +if [ -z "$init_subvolid" ]; then + _fail "Unable to get the subvolid of the first snapshot" +fi + +echo "Subvolumeid: ${init_subvolid}" >> $seqres.full + +_scratch_unmount + +# Re-create the fs, as btrfs won't reuse the subvolume id. +_scratch_mkfs >> $seqres.full 2>&1 || _fail "2nd mkfs failed" +_scratch_mount + +$BTRFS_UTIL_PROG quota enable "$SCRATCH_MNT" >> $seqres.full +$BTRFS_UTIL_PROG quota rescan -w "$SCRATCH_MNT" >> $seqres.full + +# Create a qgroup for the first subvolume, this would make the later +# subvolume creation to find an existing qgroup, and abort transaction. +$BTRFS_UTIL_PROG qgroup create 0/"$init_subvolid" "$SCRATCH_MNT" >> $seqres.full +sync + +# Now create the first snapshot, which should have the same subvolid no matter +# if the quota is enabled. +$BTRFS_UTIL_PROG subvolume create "$SCRATCH_MNT/subvol" >> $seqres.full +$BTRFS_UTIL_PROG subvolume snapshot "$SCRATCH_MNT/subvol" \ + "$SCRATCH_MNT/snapshot">> $seqres.full + +# Either the snapshot create failed and transaction is aborted thus no +# snapshot here, or we should be able to create the snapshot. +new_subvolid=$(_btrfs_get_subvolid "$SCRATCH_MNT" "snapshot") + +echo "Subvolumeid: ${new_subvolid}" >> $seqres.full + +if [ -z "$new_subvolid" ]; then + _fail "Unable to get the subvolid of the first snapshot" +fi + +# Make sure the subvolumeid for the first snapshot didn't change. +if [ "$new_subvolid" -ne "$init_subvolid" ]; then + _fail "Subvolumeid for the first snapshot changed, has ${new_subvolid} expect ${init_subvolid}" +fi + +_scratch_unmount + +echo "Silence is golden" + +# success, all done +status=0 +exit diff --git a/tests/btrfs/303.out b/tests/btrfs/303.out new file mode 100644 index 00000000..d48808e6 --- /dev/null +++ b/tests/btrfs/303.out @@ -0,0 +1,2 @@ +QA output created by 303 +Silence is golden
[BUG] There is a sysbot regression report about transaction abort during snapshot creation, which is caused by the new timing of qgroup creation and too strict error check. [FIX] The proper fix is already submitted, with the title "btrfs: do not abort transaction if there is already an existing qgroup". [TEST] The new test case would reproduce the regression by: - Create a subvolume and a snapshot of it - Record the subvolumeid of the snapshot - Re-create the fs Since btrfs won't reuse the subvolume id, we have to re-create the fs. - Enable quota and create a qgroup with the same subvolumeid - Create a subvolume and a snapshot of it For unpatched and affected kernel (thankfully no release is affected), the snapshot creation would fail due to aborted transaction. - Make sure the subvolume id doesn't change for the snapshot There is one very hacky attempt to fix it by avoiding using the subvolume id, which is completely wrong and would be caught by this extra check. Signed-off-by: Qu Wenruo <wqu@suse.com> --- tests/btrfs/303 | 80 +++++++++++++++++++++++++++++++++++++++++++++ tests/btrfs/303.out | 2 ++ 2 files changed, 82 insertions(+) create mode 100755 tests/btrfs/303 create mode 100644 tests/btrfs/303.out