diff mbox series

[OSSTEST] guest_prepare_disk: Only do the umount if we set an env var

Message ID 20191021133215.26518-1-ian.jackson@eu.citrix.com (mailing list archive)
State New, archived
Headers show
Series [OSSTEST] guest_prepare_disk: Only do the umount if we set an env var | expand

Commit Message

Ian Jackson Oct. 21, 2019, 1:32 p.m. UTC
This call to guest_umount_lv is here for the benefit of ad-hoc reruns
of (eg) ts-guest-start tidy up any ad-hoc messing about (eg from
earlier runs of ts-debian-fixup or something).  It is not needed in
production runs.

Serendipitously, this osstest code discovered a bug in the Linux
blkback: when tearing down, it sets the backend state to 6 before it
has closed the underlying block devices.  This ultimately means that
after "xl destroy" or "xl shutdown -w" there is a period when the
guest's open handle onto its storage is still open.  This is wrong.

This detection depends on us winning a tricky race.  So it shows up in
osstest as a very low probability heisenbug.  The bug is currently in
all versions of Linux and causing a bit of a nuisance.

It would be best to add a proper check for this bug.  However, this is
quite fiddly: really, it ought to be done as close to the xl command
completion as possible, in the same ssh invocation.  That would
involve a fair bit of plumbing and ad-hocery.  I don't think that
would be proportionate for such a low-impact bug.

So instead in this patch I just disable this cleanup code in the
troublesome case, unless it is explicitly requested by the user
setting OSSTEST_GUEST_DISK_MOUNT_CLEANUP to a trueish value.  (This
would be reasonably convenient for the ad-hoc testing that this call
serves.)

Thanks to Roger for diagnosing the Linux kernel bug.

CC: Roger Pau Monne <roger.pau@citrix.com>
CC: Jürgen Groß <jgross@suse.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 Osstest/TestSupport.pm | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Roger Pau Monné Oct. 21, 2019, 2:33 p.m. UTC | #1
On Mon, Oct 21, 2019 at 02:32:15PM +0100, Ian Jackson wrote:
> This call to guest_umount_lv is here for the benefit of ad-hoc reruns
> of (eg) ts-guest-start tidy up any ad-hoc messing about (eg from
> earlier runs of ts-debian-fixup or something).  It is not needed in
> production runs.
> 
> Serendipitously, this osstest code discovered a bug in the Linux
> blkback: when tearing down, it sets the backend state to 6 before it
> has closed the underlying block devices.  This ultimately means that
> after "xl destroy" or "xl shutdown -w" there is a period when the
> guest's open handle onto its storage is still open.  This is wrong.
> 
> This detection depends on us winning a tricky race.  So it shows up in
> osstest as a very low probability heisenbug.  The bug is currently in
> all versions of Linux and causing a bit of a nuisance.
> 
> It would be best to add a proper check for this bug.  However, this is
> quite fiddly: really, it ought to be done as close to the xl command
> completion as possible, in the same ssh invocation.  That would
> involve a fair bit of plumbing and ad-hocery.  I don't think that
> would be proportionate for such a low-impact bug.
> 
> So instead in this patch I just disable this cleanup code in the
> troublesome case, unless it is explicitly requested by the user
> setting OSSTEST_GUEST_DISK_MOUNT_CLEANUP to a trueish value.  (This
> would be reasonably convenient for the ad-hoc testing that this call
> serves.)
> 
> Thanks to Roger for diagnosing the Linux kernel bug.
> 
> CC: Roger Pau Monne <roger.pau@citrix.com>
> CC: Jürgen Groß <jgross@suse.com>
> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>

Thanks:

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
diff mbox series

Patch

diff --git a/Osstest/TestSupport.pm b/Osstest/TestSupport.pm
index 78f47480..6b0ee7a2 100644
--- a/Osstest/TestSupport.pm
+++ b/Osstest/TestSupport.pm
@@ -1938,7 +1938,8 @@  sub guest_create_paused ($) {
 sub guest_prepare_disk ($) {
     my ($gho) = @_;
 
-    guest_umount_lv($gho->{Host}, $gho);
+    guest_umount_lv($gho->{Host}, $gho)
+	if $ENV{'OSSTEST_GUEST_DISK_MOUNT_CLEANUP'};
 
     return if ($gho->{Diskfmt} // 'none') eq "none";