
[v6,20/25] replay: wake up vCPU when replaying

Message ID 20180912081945.3228.19776.stgit@pasha-VirtualBox (mailing list archive)
State New, archived
Series Fixing record/replay and adding reverse debugging

Commit Message

Pavel Dovgalyuk Sept. 12, 2018, 8:19 a.m. UTC
In record/replay icount mode, the vCPU thread and the iothread synchronize
their execution using checkpoints.
The vCPU thread processes the virtual timers and the iothread processes all others.
When the iothread wants to wake up a sleeping vCPU thread, it sends dummy queued
work. Therefore the following sequence of events is possible in
record mode:
 - IO: sending dummy work
 - IO: processing timers
 - CPU: wakeup
 - CPU: clearing dummy work
 - CPU: processing virtual timers

But due to races, the sequence may change in replay mode:
 - IO: sending dummy work
 - CPU: wakeup
 - CPU: clearing dummy work
 - CPU: sleeping again because there is nothing to do
 - IO: processing timers
 - CPU: zzzz

In this case the vCPU will not wake up, because the dummy work will not be
queued again.
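
The two orderings above can be replayed in a small single-threaded model.
This is only an illustrative sketch: the Model struct and every function
name in it are hypothetical and are not QEMU APIs. It assumes that the vCPU
runs only when dummy work is queued, and that the virtual timers become due
once the iothread has processed its timers.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state; hypothetical names, not QEMU data structures. */
typedef struct {
    bool dummy_work_queued;    /* wakeup request sent by the iothread */
    bool virtual_timer_due;    /* set once the iothread processed its timers */
    bool virtual_timers_done;  /* set when the vCPU handled the virtual timers */
} Model;

static void io_send_dummy_work(Model *m) { m->dummy_work_queued = true; }
static void io_process_timers(Model *m)  { m->virtual_timer_due = true; }

/* One scheduling opportunity for the vCPU thread. */
static void cpu_step(Model *m)
{
    if (!m->dummy_work_queued) {
        return;                        /* nothing woke the vCPU: zzzz */
    }
    m->dummy_work_queued = false;      /* wakeup, then clear the dummy work */
    if (m->virtual_timer_due) {
        m->virtual_timer_due = false;  /* process the virtual timers */
        m->virtual_timers_done = true;
    }
    /* otherwise the vCPU sleeps again because there is nothing to do */
}

/* Ordering seen in record mode. */
static bool record_order(void)
{
    Model m = { false, false, false };
    io_send_dummy_work(&m);
    io_process_timers(&m);
    cpu_step(&m);
    return m.virtual_timers_done;
}

/* Racy ordering in replay mode; renotify models an extra wakeup
   sent after the iothread has processed its timers. */
static bool racy_replay_order(bool renotify)
{
    Model m = { false, false, false };
    io_send_dummy_work(&m);
    cpu_step(&m);                      /* wakes too early, sleeps again */
    io_process_timers(&m);
    if (renotify) {
        io_send_dummy_work(&m);
    }
    cpu_step(&m);
    return m.virtual_timers_done;
}
```

In this model record_order() and racy_replay_order(true) end with the
virtual timers handled, while racy_replay_order(false) leaves the vCPU
asleep with a timer still due.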

This patch tries to wake up the vCPU when it is sleeping and the icount warp
checkpoint is not met. A non-matching warp checkpoint means that the vCPU has
something to do, because there is no other reason for the mismatch.
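
The decision taken in replay mode can be summarized with a small standalone
sketch. The ReplayView struct, the WarpAction enum and replay_warp_decision()
are hypothetical names used only for illustration; the two booleans stand in
for the results of replay_checkpoint(CHECKPOINT_CLOCK_WARP_START) and
replay_has_checkpoint() from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Possible outcomes in replay mode (hypothetical enum). */
typedef enum {
    WARP_START,            /* checkpoint matched: go on and start the warp */
    WARP_SKIP,             /* no checkpoint pending: nothing to do */
    WARP_SKIP_AND_NOTIFY   /* another checkpoint pending: wake the vCPU */
} WarpAction;

/* Stand-in for what the replay log reports about its next event. */
typedef struct {
    bool warp_start_matched;   /* result of replay_checkpoint(...WARP_START) */
    bool checkpoint_pending;   /* result of replay_has_checkpoint() */
} ReplayView;

static WarpAction replay_warp_decision(const ReplayView *v)
{
    if (v->warp_start_matched) {
        return WARP_START;
    }
    if (v->checkpoint_pending) {
        /* The warp checkpoint did not match although a checkpoint is
           pending, so the sleeping vCPU is assumed to have work to do. */
        return WARP_SKIP_AND_NOTIFY;
    }
    return WARP_SKIP;
}
```

Only the WARP_SKIP_AND_NOTIFY case corresponds to the added
qemu_clock_notify(QEMU_CLOCK_VIRTUAL) call; the other two cases keep the
previous behaviour.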

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>

--

v5: improve checking that vCPU is still sleeping
---
 cpus.c                  |   31 +++++++++++++++++++++----------
 include/sysemu/replay.h |    3 +++
 replay/replay.c         |   12 ++++++++++++
 3 files changed, 36 insertions(+), 10 deletions(-)

Comments

Paolo Bonzini Sept. 13, 2018, 10:12 a.m. UTC | #1
On 12/09/2018 10:19, Pavel Dovgalyuk wrote:
> This patch tries to wake up the vCPU when it sleeps and the icount warp
> checkpoint isn't met. It means that vCPU has something to do, because
> there are no other reasons of non-matching warp checkpoint.

What happens if !replay_has_checkpoint()?  Should that be an assertion?

Thanks,

Paolo
Pavel Dovgalyuk Sept. 13, 2018, 11:06 a.m. UTC | #2
> From: Paolo Bonzini [mailto:pbonzini@redhat.com]
> On 12/09/2018 10:19, Pavel Dovgalyuk wrote:
> > This patch tries to wake up the vCPU when it sleeps and the icount warp
> > checkpoint isn't met. It means that vCPU has something to do, because
> > there are no other reasons of non-matching warp checkpoint.
> 
> What happens if !replay_has_checkpoint()?  Should that be an assertion?

The condition can be true only when the vCPU thread is sleeping.
In all other cases (e.g., when it is running) the condition is false and
there is nothing to do.

Pavel Dovgalyuk

Patch

diff --git a/cpus.c b/cpus.c
index f31a70a..b12a02f 100644
--- a/cpus.c
+++ b/cpus.c
@@ -576,18 +576,29 @@  void qemu_start_warp_timer(void)
         return;
     }
 
-    /* warp clock deterministically in record/replay mode */
-    if (!replay_checkpoint(CHECKPOINT_CLOCK_WARP_START)) {
-        return;
-    }
+    if (replay_mode != REPLAY_MODE_PLAY) {
+        if (!all_cpu_threads_idle()) {
+            return;
+        }
 
-    if (!all_cpu_threads_idle()) {
-        return;
-    }
+        if (qtest_enabled()) {
+            /* When testing, qtest commands advance icount.  */
+            return;
+        }
 
-    if (qtest_enabled()) {
-        /* When testing, qtest commands advance icount.  */
-        return;
+        replay_checkpoint(CHECKPOINT_CLOCK_WARP_START);
+    } else {
+        /* warp clock deterministically in record/replay mode */
+        if (!replay_checkpoint(CHECKPOINT_CLOCK_WARP_START)) {
+            /* vCPU is sleeping and warp can't be started.
+               It is probably a race condition: notification sent
+               to vCPU was processed in advance and vCPU went to sleep.
+               Therefore we have to wake it up for doing something. */
+            if (replay_has_checkpoint()) {
+                qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
+            }
+            return;
+        }
     }
 
     /* We want to use the earliest deadline from ALL vm_clocks */
diff --git a/include/sysemu/replay.h b/include/sysemu/replay.h
index c1646b0..3130e45 100644
--- a/include/sysemu/replay.h
+++ b/include/sysemu/replay.h
@@ -144,6 +144,9 @@  void replay_shutdown_request(ShutdownCause cause);
     Returns 0 in PLAY mode if checkpoint was not found.
     Returns 1 in all other cases. */
 bool replay_checkpoint(ReplayCheckpoint checkpoint);
+/*! Used to determine that checkpoint is pending.
+    Does not proceed to the next event in the log. */
+bool replay_has_checkpoint(void);
 
 /* Asynchronous events queue */
 
diff --git a/replay/replay.c b/replay/replay.c
index a106211..2145686 100644
--- a/replay/replay.c
+++ b/replay/replay.c
@@ -241,6 +241,18 @@  out:
     return res;
 }
 
+bool replay_has_checkpoint(void)
+{
+    bool res = false;
+    if (replay_mode == REPLAY_MODE_PLAY) {
+        g_assert(replay_mutex_locked());
+        replay_account_executed_instructions();
+        res = EVENT_CHECKPOINT <= replay_state.data_kind
+              && replay_state.data_kind <= EVENT_CHECKPOINT_LAST;
+    }
+    return res;
+}
+
 static void replay_enable(const char *fname, int mode)
 {
     const char *fmode = NULL;