Message ID | 20250311050346.5868-2-ddiss@suse.de (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | f2fs/008: avoid endless wait | expand |
On Tue, Mar 11, 2025 at 04:03:47PM +1100, David Disseldorp wrote: > "udevadm wait <dev>" without a --timeout=SECONDS parameter will wait > endlessly. Endless wait can be triggered in f2fs/008 by e.g. using a > zram device as a SCRATCH_DEV, where "device type is unknown" failure > sees the /dev/mapper/$vgname-$lvname node never appear. > > Signed-off-by: David Disseldorp <ddiss@suse.de> > --- > It might make more sense to add a default timeout to _udev_wait(), but > that could also call udev{adm }settle. One of the points of moving to _udev_wait was to explicitly wait for the specific device to appear or disappear, indicating that the previous admin operations have completed. 'udevadm settle' does not do that - it waits for the global queue of events to drain and dev config failure doesn't generate udev events. Hence tests that silently fail dm/lvm device setup will continue to run on something they shouldn't have, and nobody will realise that the LVM setup did not run correctly. OTOH, _udev_wait() will hang in that situation because it's waiting for the device node to appear (or disappear) so it can be immediately used. We don't need udev to finish draining queues - we only need to wait for the device node to appear and the test is good to go. This behaviour provides obvious failures when device setup/teardown fails in some way. This should not happen, and when it does _udev_wait hanging forces that failure to be triaged and fixed immediately.... If we add a timeout to _udev_wait(), then we're back to the old behaviour where device config failures get ignored and the test runs incorrectly. Except now it is worse because we have to wait a timeout before the test is then run.... > > tests/f2fs/008 | 9 ++++++--- > 1 file changed, 6 insertions(+), 3 deletions(-) > > diff --git a/tests/f2fs/008 b/tests/f2fs/008 > index 47696f2b..fdb0b0c2 100755 > --- a/tests/f2fs/008 > +++ b/tests/f2fs/008 > @@ -35,9 +35,12 @@ _cleanup() > rm -f $tmp.* > } > > -$LVM_PROG pvcreate -f $SCRATCH_DEV >>$seqres.full 2>&1 > -$LVM_PROG vgcreate -f $vgname $SCRATCH_DEV >>$seqres.full 2>&1 > -$LVM_PROG lvcreate -y -L 1024m -n $lvname $vgname >>$seqres.full 2>&1 > +$LVM_PROG pvcreate -f $SCRATCH_DEV >>$seqres.full 2>&1 \ > + || _notrun "unabled to create LVM physical volume" > +$LVM_PROG vgcreate -f $vgname $SCRATCH_DEV >>$seqres.full 2>&1 \ > + || _notrun "unabled to create LVM volume group" > +$LVM_PROG lvcreate -y -L 1024m -n $lvname $vgname >>$seqres.full 2>&1 \ > + || _notrun "unabled to create LVM logical volume" These should be _fail calls, IMO. AFAICT, there is no reason for LVM config to fail on a valid block device. If there is a block device that doesn't support LVM, then we've got lots of tests that are going to fail the same way and we should have a _require_scratch_lvm() to notrun those tests for that type of block device (or fix the block device...). -Dave.
diff --git a/tests/f2fs/008 b/tests/f2fs/008 index 47696f2b..fdb0b0c2 100755 --- a/tests/f2fs/008 +++ b/tests/f2fs/008 @@ -35,9 +35,12 @@ _cleanup() rm -f $tmp.* } -$LVM_PROG pvcreate -f $SCRATCH_DEV >>$seqres.full 2>&1 -$LVM_PROG vgcreate -f $vgname $SCRATCH_DEV >>$seqres.full 2>&1 -$LVM_PROG lvcreate -y -L 1024m -n $lvname $vgname >>$seqres.full 2>&1 +$LVM_PROG pvcreate -f $SCRATCH_DEV >>$seqres.full 2>&1 \ + || _notrun "unabled to create LVM physical volume" +$LVM_PROG vgcreate -f $vgname $SCRATCH_DEV >>$seqres.full 2>&1 \ + || _notrun "unabled to create LVM volume group" +$LVM_PROG lvcreate -y -L 1024m -n $lvname $vgname >>$seqres.full 2>&1 \ + || _notrun "unabled to create LVM logical volume" _udev_wait /dev/mapper/$vgname-$lvname _mkfs_dev /dev/mapper/$vgname-$lvname >>$seqres.full 2>&1
"udevadm wait <dev>" without a --timeout=SECONDS parameter will wait endlessly. Endless wait can be triggered in f2fs/008 by e.g. using a zram device as a SCRATCH_DEV, where "device type is unknown" failure sees the /dev/mapper/$vgname-$lvname node never appear. Signed-off-by: David Disseldorp <ddiss@suse.de> --- It might make more sense to add a default timeout to _udev_wait(), but that could also call udev{adm }settle. tests/f2fs/008 | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)