diff mbox

fstests: common/rc: fix device still mounted error with SCRATCH_DEV_POOL

Message ID 20180113010459.24321-1-bo.li.liu@oracle.com (mailing list archive)
State New, archived
Headers show

Commit Message

Liu Bo Jan. 13, 2018, 1:04 a.m. UTC
One of btrfs tests, btrfs/011, uses SCRATCH_DEV_POOL and puts a non-SCRATCH_DEV
device as the first one when doing mkfs, and this makes
_require_scratch{_nocheck} fail to umount $SCRATCH_MNT since it checks mount
point with SCRATCH_DEV only, and for sure it finds nothing to umount and the
following tests complain about 'device still mounted' alike errors.

Introduce a helper to address this special case where both btrfs and scratch
dev pool are in use.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
 common/rc | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

Comments

Eryu Guan Jan. 15, 2018, 6:22 a.m. UTC | #1
On Fri, Jan 12, 2018 at 06:04:59PM -0700, Liu Bo wrote:
> One of btrfs tests, btrfs/011, uses SCRATCH_DEV_POOL and puts a non-SCRATCH_DEV
> device as the first one when doing mkfs, and this makes
> _require_scratch{_nocheck} fail to umount $SCRATCH_MNT since it checks mount
> point with SCRATCH_DEV only, and for sure it finds nothing to umount and the
> following tests complain about 'device still mounted' alike errors.
> 
> Introduce a helper to address this special case where both btrfs and scratch
> dev pool are in use.
> 
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>

Hmm, I didn't see this problem, I ran btrfs/011 then another tests that
uses $SCRATCH_DEV, and the second test ran fine too. Can you please
provide more details?

Anyway, I think we should fix btrfs/011 to either not use $SCRATCH_DEV
in replace operations (AFAIK, other btrfs replace tests do this) or
umount all devices before exit. And I noticed btrfs/011 does umount
$SCRATCH_MNT at the end of workout(), so usually all should be fine
(perhaps it would leave a device mounted if interrupted in the middle of
test run, because _cleanup() doesn't do umount).

Thanks,
Eryu
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Sterba Jan. 15, 2018, 5:20 p.m. UTC | #2
On Mon, Jan 15, 2018 at 02:22:28PM +0800, Eryu Guan wrote:
> On Fri, Jan 12, 2018 at 06:04:59PM -0700, Liu Bo wrote:
> > One of btrfs tests, btrfs/011, uses SCRATCH_DEV_POOL and puts a non-SCRATCH_DEV
> > device as the first one when doing mkfs, and this makes
> > _require_scratch{_nocheck} fail to umount $SCRATCH_MNT since it checks mount
> > point with SCRATCH_DEV only, and for sure it finds nothing to umount and the
> > following tests complain about 'device still mounted' alike errors.
> > 
> > Introduce a helper to address this special case where both btrfs and scratch
> > dev pool are in use.
> > 
> > Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> 
> Hmm, I didn't see this problem, I ran btrfs/011 then another tests that
> uses $SCRATCH_DEV, and the second test ran fine too. Can you please
> provide more details?
> 
> Anyway, I think we should fix btrfs/011 to either not use $SCRATCH_DEV
> in replace operations (AFAIK, other btrfs replace tests do this) or
> umount all devices before exit. And I noticed btrfs/011 does umount
> $SCRATCH_MNT at the end of workout(), so usually all should be fine
> (perhaps it would leave a device mounted if interrupted in the middle of
> test run, because _cleanup() doesn't do umount).

In my case I saw lots of test failures (btrfs/ 012 068 071 074 116 136
138 152 154 155 ...), some of them repeatedly but not reliably. This
could have been triggered by a patch in my testing branch, but I can't
tell for sure due to the inaccurate fstest checks. The common problem
was that the scratch device appeared as mounted.

We discussed that with Bo, I was suspecting some of our changes that
could theoretically leave some data in flight after umount. Bo found the
potential problems in fstests so I'll redo all the testing again with
updated fstests.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Liu Bo Jan. 16, 2018, 7:10 a.m. UTC | #3
On Mon, Jan 15, 2018 at 02:22:28PM +0800, Eryu Guan wrote:
> On Fri, Jan 12, 2018 at 06:04:59PM -0700, Liu Bo wrote:
> > One of btrfs tests, btrfs/011, uses SCRATCH_DEV_POOL and puts a non-SCRATCH_DEV
> > device as the first one when doing mkfs, and this makes
> > _require_scratch{_nocheck} fail to umount $SCRATCH_MNT since it checks mount
> > point with SCRATCH_DEV only, and for sure it finds nothing to umount and the
> > following tests complain about 'device still mounted' alike errors.
> > 
> > Introduce a helper to address this special case where both btrfs and scratch
> > dev pool are in use.
> > 
> > Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> 
> Hmm, I didn't see this problem, I ran btrfs/011 then another tests that
> uses $SCRATCH_DEV, and the second test ran fine too. Can you please
> provide more details?

Sure, so I was using 4 devices of size being 2500M, btrfs/011 bailed
out when doing a cp due to enospc then _fail is called to abort the
test, and the mount point now is associated with a different device
other than SCRATCH_DEV, so that _require_scratch_nocheck in btrfs/012
was not able to umount SCRATCH_MNT.

> 
> Anyway, I think we should fix btrfs/011 to either not use $SCRATCH_DEV
> in replace operations (AFAIK, other btrfs replace tests do this) or
> umount all devices before exit. And I noticed btrfs/011 does umount
> $SCRATCH_MNT at the end of workout(), so usually all should be fine
> (perhaps it would leave a device mounted if interrupted in the middle of
> test run, because _cleanup() doesn't do umount).

That's true, if you want, I could fix all btrfs replace tests to
umount SCRATCH_MNT right before exit.

thanks,
-liubo
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eryu Guan Jan. 16, 2018, 7:48 a.m. UTC | #4
On Mon, Jan 15, 2018 at 11:10:20PM -0800, Liu Bo wrote:
> On Mon, Jan 15, 2018 at 02:22:28PM +0800, Eryu Guan wrote:
> > On Fri, Jan 12, 2018 at 06:04:59PM -0700, Liu Bo wrote:
> > > One of btrfs tests, btrfs/011, uses SCRATCH_DEV_POOL and puts a non-SCRATCH_DEV
> > > device as the first one when doing mkfs, and this makes
> > > _require_scratch{_nocheck} fail to umount $SCRATCH_MNT since it checks mount
> > > point with SCRATCH_DEV only, and for sure it finds nothing to umount and the
> > > following tests complain about 'device still mounted' alike errors.
> > > 
> > > Introduce a helper to address this special case where both btrfs and scratch
> > > dev pool are in use.
> > > 
> > > Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> > 
> > Hmm, I didn't see this problem, I ran btrfs/011 then another tests that
> > uses $SCRATCH_DEV, and the second test ran fine too. Can you please
> > provide more details?
> 
> Sure, so I was using 4 devices of size being 2500M, btrfs/011 bailed
> out when doing a cp due to enospc then _fail is called to abort the
> test, and the mount point now is associated with a different device
> other than SCRATCH_DEV, so that _require_scratch_nocheck in btrfs/012
> was not able to umount SCRATCH_MNT.

Yeah, that's the exact case I described as below. I think adding
_scratch_umount >/dev/null 2>&1 in _cleanup() would resolve your issue.

> 
> > 
> > Anyway, I think we should fix btrfs/011 to either not use $SCRATCH_DEV
> > in replace operations (AFAIK, other btrfs replace tests do this) or
> > umount all devices before exit. And I noticed btrfs/011 does umount
> > $SCRATCH_MNT at the end of workout(), so usually all should be fine
> > (perhaps it would leave a device mounted if interrupted in the middle of
> > test run, because _cleanup() doesn't do umount).
> 
> That's true, if you want, I could fix all btrfs replace tests to
> umount SCRATCH_MNT right before exit.

I think only the tests that replace $SCRATCH_DEV (as what btrfs/011
does) need fixes, _require_scratch would umount $SCRATCH_MNT for other
tests.

Thanks,
Eryu
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/common/rc b/common/rc
index 9216efd..eab1bb7 100644
--- a/common/rc
+++ b/common/rc
@@ -1450,6 +1450,29 @@  _check_mounted_on()
 	return 0 # 0 = mounted as expected
 }
 
+_check_scratch_dev_pool_mounted_on()
+{
+	local devname=$1
+	local dev=$2
+	local mntname=$3
+	local mnt=$4
+
+	local mount_rec=`findmnt -rncv -M $mnt -o SOURCE,TARGET`
+	[ -n "$mount_rec" ] || return 1 # 1 = not mounted
+
+	# if it's mounted, make sure mount dev is one of the pool devices
+	for mount_dev in $SCRATCH_DEV_POOL; do
+		if [ "$mount_rec" == "$mount_dev $mnt" ]; then
+			return 0 # 0 = mounted as expected
+		fi
+	done
+
+	echo "$devname=$dev is mounted but not on $mntname=$mnt - aborting"
+	echo "Already mounted result:"
+	echo $mount_rec
+	return 2 # 2 == mounted on wrong mnt
+}
+
 # this test needs a scratch partition - check we're ok & unmount it
 # No post-test check of the device is required. e.g. the test intentionally
 # finishes the test with the filesystem in a corrupt state
@@ -1537,6 +1560,14 @@  _require_scratch_nocheck()
     _check_mounted_on SCRATCH_DEV $SCRATCH_DEV SCRATCH_MNT $SCRATCH_MNT
     local err=$?
     [ $err -le 1 ] || exit 1
+
+    if [ $err -eq 1 -a "$FSTYP" == "btrfs" -a ! -z "$SCRATCH_DEV_POOL" ]
+    then
+	_check_scratch_dev_pool_mounted_on SCRATCH_DEV $SCRATCH_DEV SCRATCH_MNT $SCRATCH_MNT
+	err=$?
+	[ $err -le 1 ] || exit 1
+    fi
+
     if [ $err -eq 0 ]
     then
         # if it's mounted, unmount it