[OSSTEST] host reuse fixes: Properly clear out old static tasks from history

Message ID 20201023161444.2133-1-iwj@xenproject.org (mailing list archive)
State New, archived

Commit Message

Ian Jackson Oct. 23, 2020, 4:14 p.m. UTC
The algorithm for clearing out old lifecycle entries was wrong: it
would delete all entries for non-live tasks.

In practice this would properly remove all the old entries for
non-static tasks, since ownd tasks typically don't release things
until the task ends (and it becomes non-live).  And it wouldn't remove
more than it should do unless some now-not-live task had an allocation
overlapping with us, which is not supposed to be possible if we are
doing a host wipe.  But it would not remove static tasks ever, since
they are always live.
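
The behaviour described above can be reproduced with a minimal sketch.
It assumes a much-simplified schema (the column sets, the host name
"h0" and the task ids are illustrative, not the real osstest tables)
and uses an in-memory SQLite database in place of the real one:

```python
import sqlite3

def old_cleanup(db, hostname):
    # Old algorithm: delete every lifecycle entry whose task is not live.
    db.execute(
        """DELETE FROM host_lifecycle
            WHERE hostname = ?
              AND NOT EXISTS (SELECT 1 FROM tasks t
                               WHERE t.live
                                 AND t.taskid = host_lifecycle.taskid)""",
        (hostname,))

def demo_old():
    db = sqlite3.connect(":memory:")
    # Hypothetical, simplified tables and rows for illustration only.
    db.executescript("""
        CREATE TABLE tasks (taskid INTEGER PRIMARY KEY, type TEXT, live BOOLEAN);
        CREATE TABLE host_lifecycle (lcseq INTEGER PRIMARY KEY,
                                     hostname TEXT, taskid INTEGER);
        INSERT INTO tasks VALUES (1,'ownd',0), (2,'static',1), (3,'ownd',1);
        INSERT INTO host_lifecycle VALUES (1,'h0',1), (2,'h0',2), (3,'h0',3);
    """)
    old_cleanup(db, "h0")
    rows = db.execute("SELECT taskid FROM host_lifecycle ORDER BY lcseq")
    return [r[0] for r in rows]
```

Here task 1 is a finished ownd task, task 2 is a static task and
task 3 stands for us.  The entry for task 1 goes away, but the stale
entry for the always-live static task 2 survives the wipe.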

Change to a completely different algorithm:

 * Check that only we (ie, $ttaskid) have (any shares of) this host
   allocated.  There's a function resource_check_allocated_core which
   already does this and since we're conceptually part of Executive
   it is proper for us to call it.  This is just a sanity check.

 * Delete all lifecycle entries predating the first entry made by
   us.  (We could just delete all entries other than ours, but in
   theory maybe some future code could result in a situation where
   someone else could have had another share briefly at some point.)
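
The two steps above can be sketched as follows.  This is an
illustration under assumptions, not the patch itself: the schema is
simplified, the rows are made up, and the `others` argument is a
hypothetical stand-in for whatever resource_check_allocated_core
reports about other tasks' shares.

```python
import sqlite3

def new_cleanup(db, hostname, ttaskid, others=()):
    # Sanity check (stand-in for resource_check_allocated_core):
    # nobody else may hold a share of a host we have just wiped.
    if others:
        raise RuntimeError(
            f"others have {hostname} allocated: {list(others)}")
    # Delete everything on this host that predates our first entry.
    db.execute(
        """DELETE FROM host_lifecycle
            WHERE hostname = ?
              AND lcseq < (SELECT min(lcseq) FROM host_lifecycle
                            WHERE hostname = ? AND taskid = ?)""",
        (hostname, hostname, ttaskid))

def demo_new():
    db = sqlite3.connect(":memory:")
    # Hypothetical rows: tasks 1 and 2 used h0 before us (task 3).
    db.executescript("""
        CREATE TABLE host_lifecycle (lcseq INTEGER PRIMARY KEY,
                                     hostname TEXT, taskid INTEGER);
        INSERT INTO host_lifecycle VALUES
          (1,'h0',1), (2,'h0',2), (3,'h0',3), (4,'h0',3), (5,'h1',2);
    """)
    new_cleanup(db, "h0", ttaskid=3)
    return db.execute(
        "SELECT lcseq, hostname, taskid FROM host_lifecycle ORDER BY lcseq"
    ).fetchall()
```

Whether an earlier task is live no longer matters: both older entries
on h0 are removed, while our own entries and the entry for the other
host are kept.  If we had no entry at all, min(lcseq) would be NULL
and the DELETE would remove nothing.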

This removes old junk from the "Tasks that could have affected" in

Signed-off-by: Ian Jackson <iwj@xenproject.org>
 Osstest/JobDB/Executive.pm | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/Osstest/JobDB/Executive.pm b/Osstest/JobDB/Executive.pm
index 1dcf55ff..097c8d75 100644
--- a/Osstest/JobDB/Executive.pm
+++ b/Osstest/JobDB/Executive.pm
@@ -515,15 +515,19 @@  sub jobdb_host_update_lifecycle_info ($$$) { #method
     if ($mode eq 'wiped') {
 	db_retry($flight, [qw(running)], $dbh_tests,[], sub {
-            $dbh_tests->do(<<END, {}, $hostname);
-                DELETE FROM host_lifecycle h
-                      WHERE hostname=?
-                        AND NOT EXISTS(
-                SELECT 1
-		  FROM tasks t
-		 WHERE t.live
-		   AND t.taskid = h.taskid
-                );
+            my $cshare = Osstest::Executive::resource_check_allocated_core(
+                "host",$hostname);
+            die "others have this host allocated when we have just wiped it! "
+	      .Dumper($cshare)
+	      if $cshare->{Others};
+	    $dbh_tests->do(<<END, {}, $hostname, $hostname, $ttaskid);
+                DELETE FROM host_lifecycle
+		      WHERE hostname=?
+			AND lcseq < (
+			       SELECT min(lcseq) 
+				FROM host_lifecycle
+			       WHERE hostname=? and taskid=?
+			    )
 	logm("host lifecycle: $hostname: wiped, cleared out old info");