
[v2,4/4] KVM: selftests: Test if posted interrupt delivery race with migration

Message ID 20220828222544.1964917-5-mizhang@google.com (mailing list archive)
State New, archived
Series Fix a race between posted interrupt delivery and migration in a nested VM

Commit Message

Mingwei Zhang Aug. 28, 2022, 10:25 p.m. UTC
From: Jim Mattson <jmattson@google.com>

Test whether posted interrupt delivery races with migration. Add a selftest
to demonstrate a race condition between migration and posted interrupt
delivery for a nested VM. This race causes the loss of a posted interrupt
for a nested vCPU after migration and triggers a warning on unpatched
kernels.

The selftest demonstrates that if an L2 vCPU is in a halted state before
migration, then after migration it is unable to receive a posted interrupt
from another vCPU in the same VM.

The fundamental problem is buried deep in the kernel logic: vcpu_block()
directly checks vmcs12-related mappings before a valid vmcs12 is ready.
Because of that, it fails to process the posted interrupt and triggers the
warning in vmx_guest_apic_has_interrupt():

static bool vmx_guest_apic_has_interrupt(struct kvm_vcpu *vcpu)
{
	...
	if (WARN_ON_ONCE(!is_guest_mode(vcpu)) ||
		!nested_cpu_has_vid(get_vmcs12(vcpu)) ||
		WARN_ON_ONCE(!vmx->nested.virtual_apic_map.gfn)) <= HERE
		return false;
	...
}

Signed-off-by: Jim Mattson <jmattson@google.com>
Co-developed-by: Mingwei Zhang <mizhang@google.com>
Signed-off-by: Mingwei Zhang <mizhang@google.com>
---
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../kvm/x86_64/vmx_migrate_pi_pending.c       | 291 ++++++++++++++++++
 3 files changed, 293 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/x86_64/vmx_migrate_pi_pending.c

Comments

Sean Christopherson Aug. 29, 2022, 5:21 p.m. UTC | #1
On Sun, Aug 28, 2022, Mingwei Zhang wrote:
> From: Jim Mattson <jmattson@google.com>
> 
> Test if posted interrupt delivery race with migration. Add a selftest
> to demonstrate a race condition between migration and posted
> interrupt for a nested VM. The consequence of this race condition causes
> the loss of a posted interrupt for a nested vCPU after migration and
> triggers a warning for unpatched kernel.
> 
> The selftest demonstrates that if a L2 vCPU is in halted state before
> migration, then after migration, it is not able to receive a posted
> interrupt from another vCPU within the same VM.

For tests, try to phrase the changelog in terms of what architectural behavior is
being tested, as opposed to stating what exact KVM bug is being targeted.  It's
definitely helpful to call out the KVM bug, but do that _after_ explaining the
test itself.  The problem with talking about a specific KVM bug is that the
bug will eventually be fixed, whereas the architectural behavior being tested
stays relevant.

IIUC, this can be:

  Add a test to verify that a posted interrupt wakes the target vCPU from
  a halted state if the VM is migrated while the vCPU is halted.

and then something like this to call out the known KVM bug

  This test exposes a bug where KVM checks the vAPIC page before loading
  nested pages when the vCPU is blocking.

> The fundamental problem is deeply buried in the kernel logic where
> vcpu_block() will directly check vmcs12 related mappings before having a
> valid vmcs12 ready.  Because of that, it fails to process the posted
> interrupt and triggers the warning in vmx_guest_apic_has_interrupt()
> 
> static bool vmx_guest_apic_has_interrupt(struct kvm_vcpu *vcpu)
> {
> 	...
> 	if (WARN_ON_ONCE(!is_guest_mode(vcpu)) ||
> 		!nested_cpu_has_vid(get_vmcs12(vcpu)) ||
> 		WARN_ON_ONCE(!vmx->nested.virtual_apic_map.gfn)) <= HERE
> 		return false;
> 	...
> }
> +static void vcpu0_ipi_handler(struct ex_regs *regs)

This needs to clarify that it's an L2 handler, that info is _extremely_ important
to understanding this sequence.  And even if that info were superfluous, it's still
a good habit to use consistent namespacing.

> +{
> +	 asm volatile("inb %%dx, %%al"
> +		      : : [port] "d" (PORT_L0_EXIT) : "rax");

Can't this use GUEST_SYNC()?
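
E.g. something like this (completely untested sketch, using the L2-scoped
name suggested above):

static void vcpu0_l2_ipi_handler(struct ex_regs *regs)
{
	/* Tell L0 userspace that the posted interrupt was delivered in L2. */
	GUEST_SYNC(1);

	/* Exit to L1 so l1_vcpu0_guest_code() can finish via GUEST_DONE(). */
	asm volatile("vmcall");
}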

> +	asm volatile("vmcall");

A comment would be helpful, but not as necessary if this is vcpu0_l2_ipi_handler().

> +}
> +
> +static void l2_vcpu0_guest_code(void)

I have a slight preference for

	vcpu0_l2_ipi_handler()
	vcpu0_l2_guest_code()

	vcpu1_l1_guest_code()

because the "vcpu0 vs. vcpu1" is the broader scope, and then "l1 vs. l2" further
clarifies the exact scope of the function.

> +{
> +	asm volatile("cli");
> +	asm volatile("sti; nop; hlt");

What is this code trying to do?  Assuming the intent is to ensure the posted IRQ
arrives after "hlt", this needs to be:

	GUEST_ASSERT(!irqs_enabled());
	asm volatile("sti; hlt");

because if interrupts are enabled when l2_vcpu0_guest_code() starts running, then
the IRQ can arrive before CLI.  And the "nop" needs to go because the nop will
consume the STI shadow, i.e. the IRQ can arrive before the "hlt".  irqs_enabled()
needs to be defined, but it's fairly straightforward; something like this:

static __always_inline bool irqs_enabled(void)
{
	unsigned long flags;

	asm volatile("pushf ; pop %0"
		     : "=rm" (flags)
		     :
		     : "memory");

	return flags & X86_EFLAGS_IF;
}

If my assumption is wrong, then this needs a very verbose comment.

> +static void post_intr(u8 vector, void *pi_desc)
> +{
> +       set_bit(vector, pi_desc);
> +       set_bit(PI_ON_BIT, pi_desc);
> +}
> +
> +static void l1_vcpu1_guest_code(void *vcpu0_pi_desc)
> +{
> +       post_intr(L2_INTR, vcpu0_pi_desc);

Open code post_intr() here.  I would expect a "post_intr()" helper to actually do
the notification.  Separating the two things confused me.

> +       x2apic_enable();

Why take a dependency on x2APIC?  Either way, enable x2APIC, then post the
interrupt.  Again, it's weird splitting "set info in PI descriptor" from the
notification.
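
E.g. something like this (untested sketch, keeping the x2APIC dependency and
folding the notification into the helper):

static void post_intr(u8 vector, void *pi_desc, u32 dest_apic_id)
{
	/* Set the vector and ON bit in the target vCPU's PI descriptor... */
	set_bit(vector, pi_desc);
	set_bit(PI_ON_BIT, pi_desc);

	/* ...then send the notification vector to actually post the IRQ. */
	x2apic_write_reg(APIC_ICR, ((u64)dest_apic_id << 32) |
			 APIC_DEST_PHYSICAL | APIC_DM_FIXED | PI_NV);
}

static void vcpu1_l1_guest_code(void *vcpu0_pi_desc)
{
	x2apic_enable();
	post_intr(L2_INTR, vcpu0_pi_desc, VCPU_ID0);
	GUEST_DONE();
}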

> +       x2apic_write_reg(APIC_ICR, ((u64)VCPU_ID0 << 32) |
> +                        APIC_DEST_PHYSICAL | APIC_DM_FIXED | PI_NV);
> +       GUEST_DONE();
> +}

...

> +void *create_and_run_vcpu1(void *arg)
> +{
> +	struct ucall uc;
> +	struct kvm_run *run;
> +	struct kvm_mp_state vcpu0_mp_state;
> +
> +	pthread_cpu1 = pthread_self();
> +
> +	/* Keep trying to kick out vcpu0 until it is in halted state. */
> +	for (;;) {
> +		WRITE_ONCE(vcpu0_can_run, true);
> +		sleep(1);
> +		WRITE_ONCE(vcpu0_can_run, false);
> +		pthread_kill(pthread_cpu0, SIGUSR1);
> +		printf("vcpu1: Sent SIGUSR1 to vcpu0\n");

Use pr_debug(); this has the potential to spam the console.  And then probably use
pr_info() for the other printfs so that they can be turned off via QUIET.

> +
> +		while (READ_ONCE(vcpu0_running))
> +			;

vcpu0_running is unnecessary.  KVM needs to acquire vcpu->mutex to do KVM_GET_MP_STATE,
i.e. vcpu_mp_state_get() will block until KVM_RUN completes.  Nothing guarantees
vcpu0_running will be set from main() before this gets to the while-loop, so
vcpu_mp_state_get() racing with KVM_RUN is already possible.
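
I.e. the kick loop can rely purely on the ioctl's implicit serialization,
something like (untested):

	/* Keep kicking vcpu0 out of KVM_RUN until it is observed halted. */
	for (;;) {
		WRITE_ONCE(vcpu0_can_run, true);
		sleep(1);
		WRITE_ONCE(vcpu0_can_run, false);
		pthread_kill(pthread_cpu0, SIGUSR1);
		pr_debug("vcpu1: Sent SIGUSR1 to vcpu0\n");

		/*
		 * No handshake needed: KVM_GET_MP_STATE takes vcpu->mutex and
		 * thus blocks until the in-flight KVM_RUN returns.
		 */
		vcpu_mp_state_get(vcpu0, &vcpu0_mp_state);
		if (vcpu0_mp_state.mp_state == KVM_MP_STATE_HALTED)
			break;
	}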

> +
> +		vcpu_mp_state_get(vcpu0, &vcpu0_mp_state);
> +		if (vcpu0_mp_state.mp_state == KVM_MP_STATE_HALTED)
> +			break;
> +	}
> +
> +	printf("vcpu1: Kicked out vcpu0 and ensure vcpu0 is halted\n");
> +
> +	/* Use save_restore_vm() to simulate a VM migration. */
> +	save_restore_vm(vm);
> +
> +	printf("vcpu1: Finished save and restore vm.\n");

Uber nit, be consistent on whether or not the test uses punctuation.

> +	vcpu1 = vm_vcpu_add(vm, VCPU_ID1, l1_vcpu1_guest_code);
> +	vcpu_args_set(vcpu1, 1, vmx->posted_intr_desc);
> +
> +	/* Start an L1 in vcpu1 and send a posted interrupt to halted L2 in vcpu0. */
> +	for (;;) {
> +		run = vcpu1->run;
> +		vcpu_run(vcpu1);
> +
> +		TEST_ASSERT(run->exit_reason == KVM_EXIT_IO,
> +			    "vcpu1: Got exit_reason other than KVM_EXIT_IO: %u (%s)\n",
> +			    run->exit_reason,
> +			    exit_reason_str(run->exit_reason));
> +
> +		switch (get_ucall(vcpu1, &uc)) {
> +		case UCALL_ABORT:
> +			TEST_FAIL("%s", (const char *)uc.args[0]);
> +			/* NOT REACHED */

			REPORT_GUEST_ASSERT(...)

> +		case UCALL_DONE:
> +			printf("vcpu1: Successfully send a posted interrupt to vcpu0\n");
> +			goto done;
> +		default:
> +			TEST_FAIL("vcpu1: Unknown ucall %lu", uc.cmd);
> +		}
> +	}
> +
> +done:
> +	/*
> +	 * Allow vcpu0 resume execution from L0 userspace and check if the
> +	 * posted interrupt get executed.
> +	 */
> +	WRITE_ONCE(vcpu0_can_run, true);
> +	sleep(1);

What guarantees that sleep(1) is sufficient for the vCPU to get back into the guest?
This might be a good candidate for a semaphore?
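
Rough sketch of the semaphore handshake (untested; the semaphore name and the
10 second timeout are illustrative only):

#include <semaphore.h>
#include <time.h>

static sem_t pi_delivered;

	/* main(), before spawning the vcpu1 thread: */
	sem_init(&pi_delivered, 0, 0);

	/* vcpu0 loop, where the PORT_L0_EXIT (or GUEST_SYNC) exit is handled: */
	WRITE_ONCE(pi_executed, true);
	sem_post(&pi_delivered);

	/* vcpu1 thread, replacing "WRITE_ONCE(vcpu0_can_run, true); sleep(1);": */
	struct timespec ts;

	WRITE_ONCE(vcpu0_can_run, true);
	clock_gettime(CLOCK_REALTIME, &ts);
	ts.tv_sec += 10;
	TEST_ASSERT(!sem_timedwait(&pi_delivered, &ts),
		    "vcpu0 did not execute the posted interrupt");

Using sem_timedwait() instead of a bare sem_wait() keeps the test from hanging
forever on a kernel that still loses the posted interrupt.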

> +	TEST_ASSERT(READ_ONCE(pi_executed),
> +		    "vcpu0 did not execute the posted interrupt.\n");
> +
> +	return NULL;
> +}

...

> +       TEST_REQUIRE(kvm_has_cap(KVM_CAP_NESTED_STATE));
> +       TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_VMX));

This also requires APICv / posted interrupts, and x2APIC if that hardcoded behavior
is kept.
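
E.g. (sketch; the x2APIC requirement assumes the hardcoded x2APIC usage stays,
and the nested posted-interrupt check is only described in a comment rather
than guessing at a helper):

	TEST_REQUIRE(kvm_has_cap(KVM_CAP_NESTED_STATE));
	TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_VMX));
	TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_X2APIC));
	/*
	 * Also verify that KVM's nested VMX configuration exposes the
	 * "process posted interrupts" pin-based control, e.g. via the
	 * MSR_IA32_VMX_TRUE_PINBASED_CTLS feature MSR.
	 */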

> +	for (;;) {
> +		struct kvm_run *run = vcpu0->run;
> +		struct ucall uc;
> +		int rc;
> +
> +		while (!READ_ONCE(vcpu0_can_run))
> +			;
> +
> +		WRITE_ONCE(vcpu0_running, true);
> +
> +		rc = __vcpu_run(vcpu0);
> +
> +		vcpu0->run->immediate_exit = 0;

Why?  vcpu_run_complete_io() is the only thing that sets immediate_exit, and it
clears the flag after doing __vcpu_run().
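
For context, the helper is roughly (paraphrased, not the verbatim library code):

void vcpu_run_complete_io(struct kvm_vcpu *vcpu)
{
	int ret;

	/* Force an immediate exit so KVM only completes the pending I/O. */
	vcpu->run->immediate_exit = 1;
	ret = __vcpu_run(vcpu);
	vcpu->run->immediate_exit = 0;

	TEST_ASSERT(ret == -1 && errno == EINTR,
		    "KVM_RUN didn't exit immediately, rc: %i", ret);
}

i.e. clearing the flag again in the test itself is redundant.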

> +
> +		/*
> +		 * When vCPU is kicked out by a signal, ensure a consistent vCPU
> +		 * state to prepare for migration before setting the
> +		 * vcpu_running flag to false.
> +		 */
> +		if (rc == -1 && run->exit_reason == KVM_EXIT_INTR) {
> +			vcpu_run_complete_io(vcpu0);
> +
> +			WRITE_ONCE(vcpu0_running, false);
> +
> +			continue;
> +		}
> +
> +		WRITE_ONCE(vcpu0_running, false);
> +
> +		if (run->io.port == PORT_L0_EXIT) {

One of the motivations for using GUEST_SYNC() instead of PORT_L0_EXIT is that
this test could (very theoretically) get a false pass if GUEST_DONE() is reached
before PORT_L0_EXIT is encountered.
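
With GUEST_SYNC(), the host side folds naturally into the ucall switch,
something like (untested, also picking up the REPORT_GUEST_ASSERT() and
pr_info() suggestions from above):

		switch (get_ucall(vcpu0, &uc)) {
		case UCALL_SYNC:
			/* GUEST_SYNC() from the L2 IPI handler: the IRQ arrived. */
			pr_info("vcpu0: Executed the posted interrupt\n");
			WRITE_ONCE(pi_executed, true);
			continue;
		case UCALL_ABORT:
			REPORT_GUEST_ASSERT(uc);
			/* NOT REACHED */
		case UCALL_DONE:
			goto done;
		default:
			TEST_FAIL("vcpu0: Unknown ucall %lu", uc.cmd);
		}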

> +			printf("vcpu0: Executed the posted interrupt\n");
> +			WRITE_ONCE(pi_executed, true);
> +			continue;
> +		}
> +
> +		switch (get_ucall(vcpu0, &uc)) {
> +		case UCALL_ABORT:
> +			TEST_FAIL("%s", (const char *)uc.args[0]);

			REPORT_GUEST_ASSERT(...)

> +			/* NOT REACHED */
> +		case UCALL_DONE:
> +			goto done;

Just break?

> +		default:
> +			TEST_FAIL("vcpu0: Unknown ucall %lu", uc.cmd);
> +		}
> +	}
> +
> +done:
> +	kvm_vm_free(vm);
> +	return 0;
> +}
> -- 
> 2.37.2.672.g94769d06f0-goog
>

Patch

diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore
index d625a3f83780..749b2be5b23c 100644
--- a/tools/testing/selftests/kvm/.gitignore
+++ b/tools/testing/selftests/kvm/.gitignore
@@ -51,6 +51,7 @@ 
 /x86_64/vmx_exception_with_invalid_guest_state
 /x86_64/vmx_invalid_nested_guest_state
 /x86_64/vmx_msrs_test
+/x86_64/vmx_migrate_pi_pending
 /x86_64/vmx_preemption_timer_test
 /x86_64/vmx_set_nested_state_test
 /x86_64/vmx_tsc_adjust_test
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 4c122f1b1737..5e830c65c068 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -110,6 +110,7 @@  TEST_GEN_PROGS_x86_64 += x86_64/vmx_dirty_log_test
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_exception_with_invalid_guest_state
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_msrs_test
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_invalid_nested_guest_state
+TEST_GEN_PROGS_x86_64 += x86_64/vmx_migrate_pi_pending
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_set_nested_state_test
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_tsc_adjust_test
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_nested_tsc_scaling_test
diff --git a/tools/testing/selftests/kvm/x86_64/vmx_migrate_pi_pending.c b/tools/testing/selftests/kvm/x86_64/vmx_migrate_pi_pending.c
new file mode 100644
index 000000000000..6f1a9f284754
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/vmx_migrate_pi_pending.c
@@ -0,0 +1,291 @@ 
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * vmx_migrate_pi_pending
+ *
+ * Copyright (C) 2022, Google, LLC.
+ *
+ * Deliver a nested posted interrupt between migration and the first
+ * KVM_RUN on the target.
+ */
+
+#include "test_util.h"
+#include "kvm_util.h"
+#include "processor.h"
+#include "vmx.h"
+
+#include <string.h>
+#include <sys/ioctl.h>
+#include <linux/bitmap.h>
+#include <pthread.h>
+#include <signal.h>
+
+#include "kselftest.h"
+
+#define VCPU_ID0 0
+#define VCPU_ID1 1
+#define PI_ON_BIT 256
+#define PI_NV    0x42
+#define L2_INTR  0x71
+
+enum {
+	PORT_L0_EXIT = 0x2000,
+};
+
+/* The virtual machine object. */
+struct vmx_pages *vmx;
+
+static struct kvm_vm *vm;
+static struct kvm_vcpu *vcpu0;
+static struct kvm_vcpu *vcpu1;
+bool vcpu0_can_run = true;
+bool vcpu0_running;
+bool pi_executed;
+pthread_t pthread_cpu0;
+pthread_t pthread_cpu1;
+
+static void vcpu0_ipi_handler(struct ex_regs *regs)
+{
+	asm volatile("inb %%dx, %%al"
+		     : : [port] "d" (PORT_L0_EXIT) : "rax");
+	asm volatile("vmcall");
+}
+
+static void l2_vcpu0_guest_code(void)
+{
+	asm volatile("cli");
+	asm volatile("sti; nop; hlt");
+}
+
+static void l1_vcpu0_guest_code(struct vmx_pages *vmx_pages)
+{
+#define L2_GUEST_STACK_SIZE 64
+	unsigned long l2_vcpu0_guest_stack[L2_GUEST_STACK_SIZE];
+	uint32_t control;
+
+	x2apic_enable();
+
+	GUEST_ASSERT(prepare_for_vmx_operation(vmx_pages));
+	GUEST_ASSERT(load_vmcs(vmx_pages));
+
+	/* Prepare the VMCS for L2 execution. */
+	prepare_vmcs(vmx_pages, l2_vcpu0_guest_code,
+		     &l2_vcpu0_guest_stack[L2_GUEST_STACK_SIZE]);
+	control = vmreadz(PIN_BASED_VM_EXEC_CONTROL);
+	control |= (PIN_BASED_EXT_INTR_MASK |
+		    PIN_BASED_POSTED_INTR);
+	vmwrite(PIN_BASED_VM_EXEC_CONTROL, control);
+	control = vmreadz(CPU_BASED_VM_EXEC_CONTROL);
+	control |= (CPU_BASED_TPR_SHADOW |
+		    CPU_BASED_ACTIVATE_SECONDARY_CONTROLS);
+	vmwrite(CPU_BASED_VM_EXEC_CONTROL, control);
+	control = vmreadz(SECONDARY_VM_EXEC_CONTROL);
+	control |= (SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |
+		    SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY);
+	vmwrite(SECONDARY_VM_EXEC_CONTROL, control);
+	control = vmreadz(VM_EXIT_CONTROLS);
+	control |= VM_EXIT_ACK_INTR_ON_EXIT;
+	vmwrite(VM_EXIT_CONTROLS, control);
+	vmwrite(VIRTUAL_APIC_PAGE_ADDR, vmx_pages->virtual_apic_gpa);
+	vmwrite(POSTED_INTR_DESC_ADDR, vmx_pages->posted_intr_desc_gpa);
+	vmwrite(POSTED_INTR_NV, PI_NV);
+
+	GUEST_ASSERT(!vmlaunch());
+	GUEST_ASSERT(vmreadz(VM_EXIT_REASON) == EXIT_REASON_VMCALL);
+	GUEST_ASSERT(!test_bit(PI_ON_BIT, (void *)vmx_pages->posted_intr_desc));
+	GUEST_DONE();
+}
+
+static void post_intr(u8 vector, void *pi_desc)
+{
+	set_bit(vector, pi_desc);
+	set_bit(PI_ON_BIT, pi_desc);
+}
+
+static void l1_vcpu1_guest_code(void *vcpu0_pi_desc)
+{
+	post_intr(L2_INTR, vcpu0_pi_desc);
+	x2apic_enable();
+	x2apic_write_reg(APIC_ICR, ((u64)VCPU_ID0 << 32) |
+			 APIC_DEST_PHYSICAL | APIC_DM_FIXED | PI_NV);
+	GUEST_DONE();
+}
+
+static void save_restore_vm(struct kvm_vm *vm)
+{
+	struct kvm_regs regs1 = {}, regs2 = {};
+	struct kvm_x86_state *state;
+
+	state = vcpu_save_state(vcpu0);
+	vcpu_regs_get(vcpu0, &regs1);
+
+	kvm_vm_release(vm);
+
+	/* Restore state in a new VM.  */
+	vcpu0 = vm_recreate_with_one_vcpu(vm);
+	vcpu_load_state(vcpu0, state);
+	kvm_x86_state_cleanup(state);
+
+	vcpu_regs_get(vcpu0, &regs2);
+	TEST_ASSERT(!memcmp(&regs1, &regs2, sizeof(regs2)),
+		    "vcpu0: Unexpected register values after vcpu_load_state; rdi: %lx rsi: %lx",
+		    (ulong) regs2.rdi, (ulong) regs2.rsi);
+}
+
+void *create_and_run_vcpu1(void *arg)
+{
+	struct ucall uc;
+	struct kvm_run *run;
+	struct kvm_mp_state vcpu0_mp_state;
+
+	pthread_cpu1 = pthread_self();
+
+	/* Keep trying to kick out vcpu0 until it is in halted state. */
+	for (;;) {
+		WRITE_ONCE(vcpu0_can_run, true);
+		sleep(1);
+		WRITE_ONCE(vcpu0_can_run, false);
+		pthread_kill(pthread_cpu0, SIGUSR1);
+		printf("vcpu1: Sent SIGUSR1 to vcpu0\n");
+
+		while (READ_ONCE(vcpu0_running))
+			;
+
+		vcpu_mp_state_get(vcpu0, &vcpu0_mp_state);
+		if (vcpu0_mp_state.mp_state == KVM_MP_STATE_HALTED)
+			break;
+	}
+
+	printf("vcpu1: Kicked out vcpu0 and ensured vcpu0 is halted\n");
+
+	/* Use save_restore_vm() to simulate a VM migration. */
+	save_restore_vm(vm);
+
+	printf("vcpu1: Finished save and restore vm.\n");
+	vcpu1 = vm_vcpu_add(vm, VCPU_ID1, l1_vcpu1_guest_code);
+	vcpu_args_set(vcpu1, 1, vmx->posted_intr_desc);
+
+	/* Start an L1 in vcpu1 and send a posted interrupt to halted L2 in vcpu0. */
+	for (;;) {
+		run = vcpu1->run;
+		vcpu_run(vcpu1);
+
+		TEST_ASSERT(run->exit_reason == KVM_EXIT_IO,
+			    "vcpu1: Got exit_reason other than KVM_EXIT_IO: %u (%s)\n",
+			    run->exit_reason,
+			    exit_reason_str(run->exit_reason));
+
+		switch (get_ucall(vcpu1, &uc)) {
+		case UCALL_ABORT:
+			TEST_FAIL("%s", (const char *)uc.args[0]);
+			/* NOT REACHED */
+		case UCALL_DONE:
+			printf("vcpu1: Successfully sent a posted interrupt to vcpu0\n");
+			goto done;
+		default:
+			TEST_FAIL("vcpu1: Unknown ucall %lu", uc.cmd);
+		}
+	}
+
+done:
+	/*
+	 * Allow vcpu0 to resume execution from L0 userspace and check whether
+	 * the posted interrupt gets executed.
+	 */
+	WRITE_ONCE(vcpu0_can_run, true);
+	sleep(1);
+	TEST_ASSERT(READ_ONCE(pi_executed),
+		    "vcpu0 did not execute the posted interrupt.\n");
+
+	return NULL;
+}
+
+void sig_handler(int signum, siginfo_t *info, void *context)
+{
+	TEST_ASSERT(pthread_self() == pthread_cpu0,
+		    "Incorrect receiver of the signal, expected pthread_cpu0: "
+		    "%lu, but got: %lu\n", pthread_cpu0, pthread_self());
+	printf("vcpu0: Execute sighandler for signal: %d\n", signum);
+}
+
+int main(int argc, char *argv[])
+{
+	vm_vaddr_t vmx_pages_gva;
+	struct sigaction sig_action;
+	struct sigaction old_action;
+
+	memset(&sig_action, 0, sizeof(sig_action));
+	sig_action.sa_sigaction = sig_handler;
+	sig_action.sa_flags = SA_RESTART | SA_SIGINFO;
+	sigemptyset(&sig_action.sa_mask);
+	sigaction(SIGUSR1, &sig_action, &old_action);
+
+	pthread_cpu0 = pthread_self();
+	printf("vcpu0: Finished setting up the signal handler for SIGUSR1\n");
+
+	TEST_REQUIRE(kvm_has_cap(KVM_CAP_NESTED_STATE));
+	TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_VMX));
+
+	vm = vm_create_with_one_vcpu(&vcpu0, (void *)l1_vcpu0_guest_code);
+
+	vm_init_descriptor_tables(vm);
+	vcpu_init_descriptor_tables(vcpu0);
+	vm_install_exception_handler(vm, L2_INTR, vcpu0_ipi_handler);
+
+	/* Allocate VMX pages and shared descriptors (vmx_pages). */
+	vmx = vcpu_alloc_vmx(vm, &vmx_pages_gva);
+	prepare_virtual_apic(vmx, vm);
+	prepare_posted_intr_desc(vmx, vm);
+	vcpu_args_set(vcpu0, 1, vmx_pages_gva);
+
+	pthread_create(&pthread_cpu1, NULL, create_and_run_vcpu1, NULL);
+
+	for (;;) {
+		struct kvm_run *run = vcpu0->run;
+		struct ucall uc;
+		int rc;
+
+		while (!READ_ONCE(vcpu0_can_run))
+			;
+
+		WRITE_ONCE(vcpu0_running, true);
+
+		rc = __vcpu_run(vcpu0);
+
+		vcpu0->run->immediate_exit = 0;
+
+		/*
+		 * When the vCPU is kicked out by a signal, ensure a consistent vCPU
+		 * state to prepare for migration before setting the
+		 * vcpu_running flag to false.
+		 */
+		if (rc == -1 && run->exit_reason == KVM_EXIT_INTR) {
+			vcpu_run_complete_io(vcpu0);
+
+			WRITE_ONCE(vcpu0_running, false);
+
+			continue;
+		}
+
+		WRITE_ONCE(vcpu0_running, false);
+
+		if (run->io.port == PORT_L0_EXIT) {
+			printf("vcpu0: Executed the posted interrupt\n");
+			WRITE_ONCE(pi_executed, true);
+			continue;
+		}
+
+		switch (get_ucall(vcpu0, &uc)) {
+		case UCALL_ABORT:
+			TEST_FAIL("%s", (const char *)uc.args[0]);
+			/* NOT REACHED */
+		case UCALL_DONE:
+			goto done;
+		default:
+			TEST_FAIL("vcpu0: Unknown ucall %lu", uc.cmd);
+		}
+	}
+
+done:
+	kvm_vm_free(vm);
+	return 0;
+}