
btrfs-progs: tests: fix random mkfs.btrfs failure due to loopdev cache

Message ID aa3f3c927b62d1da51166efafa856e18d01cc1ac.1692861033.git.anand.jain@oracle.com (mailing list archive)
State New, archived
Series btrfs-progs: tests: fix random mkfs.btrfs failure due to loopdev cache

Commit Message

Anand Jain Aug. 24, 2023, 7:13 a.m. UTC
Sometimes, I randomly see failures like below.

    [TEST/fsck]   013-extent-tree-rebuild
failed: /Volumes/ws/btrfs-progs/mkfs.btrfs -f /Volumes/ws/btrfs-progs/tests/test.img
test failed for case 013-extent-tree-rebuild
make: *** [Makefile:484: test-fsck] Error 1

It looks like losetup -D failed because the device was busy; when run
again it succeeds. Possibly the loop device was still writing to the
backing store.

With the losetup --direct-io=on option, as in the change below, the
failure has not reproduced so far.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 tests/common | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

David Sterba Aug. 28, 2023, 3:46 p.m. UTC | #1
On Thu, Aug 24, 2023 at 03:13:04PM +0800, Anand Jain wrote:
> Sometimes, I randomly see failures like below.
> 
>     [TEST/fsck]   013-extent-tree-rebuild
> failed: /Volumes/ws/btrfs-progs/mkfs.btrfs -f /Volumes/ws/btrfs-progs/tests/test.img
> test failed for case 013-extent-tree-rebuild
> make: *** [Makefile:484: test-fsck] Error 1
> 
> It looks like losetup -D failed because the device was busy; when run
> again it succeeds. Possibly the loop device was still writing to the
> backing store.
> 
> With the losetup --direct-io=on option, as in the change below, the
> failure has not reproduced so far.

It's interesting that this makes it work, because test
013-extent-tree-rebuild does not use the prepare_loopdevs() helper at
all and mkfs works on the file directly.

The default TEST_DEV is a plain file that is transparently mounted as a
loop device by run_check_mount_test_dev via -o loop, which uses the
losetup defaults. Direct I/O is off by default and there's no mount
option for it.
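
One way to confirm which mode a given loop device ended up in is to read
the DIO column from losetup. The function below is only a sketch:
loopdev_dio_enabled is a hypothetical name, not an existing helper in
tests/common, and it assumes a util-linux losetup recent enough to
provide the DIO output column. The real test suite would also run the
command through $SUDO_HELPER, which is left out here.

```shell
# Hypothetical helper, not part of tests/common.
# Returns 0 if direct I/O is enabled on the given loop device,
# 1 if it is disabled, 2 if losetup itself failed.
loopdev_dio_enabled()
{
	local dev="$1"
	local dio

	dio=$(losetup --list --noheadings --output DIO "$dev") || return 2
	# The DIO column is 1 when direct I/O is on and 0 when it is off;
	# strip the column padding before comparing.
	[ "$(echo "$dio" | tr -d '[:space:]')" = "1" ]
}
```

Called on the device that backs a mount made with -o loop, this would be
expected to return 1, given that direct I/O is off by default.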

I'm not sure how to fix that; I want the test suite to run without
needing a block device. A fix could be at the test level, forcing
TEST_DEV to be a loop device, but handling it transparently inside the
mount/umount helpers may be fragile.

We could add helpers for tests that need a block device, which would
mean adding the prepare and cleanup calls. Alternatively, an on-exit
function implemented via signal traps could be used, but then each test
case would have to be updated anyway.
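
A rough sketch of what such a prepare/cleanup pair with an on-exit trap
might look like follows. The names setup_test_loopdev,
cleanup_test_loopdev and _test_loopdev are made up for illustration and
are not existing helpers in tests/common; the sketch also calls losetup
directly rather than through run_check_stdout and $SUDO_HELPER.

```shell
# Hypothetical helpers, not the ones in tests/common.
# Loop device currently attached by setup_test_loopdev, empty if none.
_test_loopdev=""

# Attach the image file to a free loop device with direct I/O enabled
# and arrange for it to be detached when the shell exits.
setup_test_loopdev()
{
	local img="$1"

	_test_loopdev=$(losetup --direct-io=on --find --show "$img") || return 1
	# Detach on normal exit; turn INT/TERM into an exit so that the
	# EXIT trap runs for interrupted tests as well.
	trap cleanup_test_loopdev EXIT
	trap 'exit 1' INT TERM
}

# Detach the loop device if one is attached; safe to call repeatedly.
cleanup_test_loopdev()
{
	[ -n "$_test_loopdev" ] || return 0
	losetup -d "$_test_loopdev"
	_test_loopdev=""
}
```

A test case would call setup_test_loopdev on its image, run mkfs.btrfs on
"$_test_loopdev", and rely on the trap for cleanup, which still means
touching every test that needs a block device.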

Patch

diff --git a/tests/common b/tests/common
index 602a4122f8bd..72ea8c688ec5 100644
--- a/tests/common
+++ b/tests/common
@@ -834,7 +834,7 @@  prepare_loopdevs()
 		chmod a+rw "$loopdev_prefix$i"
 		truncate -s0 "$loopdev_prefix$i"
 		truncate -s2g "$loopdev_prefix$i"
-		loopdevs[$i]=`run_check_stdout $SUDO_HELPER losetup --find --show "$loopdev_prefix$i"`
+		loopdevs[$i]=`run_check_stdout $SUDO_HELPER losetup --direct-io=on --find --show "$loopdev_prefix$i"`
 	done
 }