mbox series

[v4,00/10] arm64: Stolen time support

Message ID 20190830084255.55113-1-steven.price@arm.com (mailing list archive)
Headers show
Series arm64: Stolen time support | expand

Message

Steven Price Aug. 30, 2019, 8:42 a.m. UTC
This series add support for paravirtualized time for arm64 guests and
KVM hosts following the specification in Arm's document DEN 0057A:

https://developer.arm.com/docs/den0057/a

It implements support for stolen time, allowing the guest to
identify time when it is forcibly not executing.

It doesn't implement support for Live Physical Time (LPT) as there are
some concerns about the overheads and approach in the above
specification, and I expect an updated version of the specification to
be released soon with just the stolen time parts.

NOTE: Patches 8 and 9 will conflict with Mark Rutland's series[1] cleaning
up the SMCCC conduit. I do feel that the addition of an _invoke() call
makes a number of call sites cleaner and it should be possible to
integrate both this and Mark's other cleanups.

[1] https://lore.kernel.org/linux-arm-kernel/20190809132245.43505-1-mark.rutland@arm.com/

Also available as a git tree:
git://linux-arm.org/linux-sp.git stolen_time/v4

Changes from v3:
https://lore.kernel.org/lkml/20190821153656.33429-1-steven.price@arm.com/
 * There's no longer a PV_TIME device, instead there are attributes on
   the VCPU. This allows the stolen time structures to be places
   arbitrarily by user space (subject to 64 byte alignment).
 * Split documentation between information on the hypercalls and the
   attributes on the VCPU
 * Fixed the type of SMCCC functions to return long not int

Changes from v2:
https://lore.kernel.org/lkml/20190819140436.12207-1-steven.price@arm.com/
 * Switched from using gfn_to_hva_cache to a new macro kvm_put_guest()
   that can provide the single-copy atomicity required (on arm64). This
   macro is added in patch 4.
 * Tidied up the locking for kvm_update_stolen_time().
   pagefault_disable() was unnecessary and the caller didn't need to
   take kvm->srcu as the function does it itself.
 * Removed struct kvm_arch_pvtime from the arm implementation, replaced
   instead with inline static functions which are empty for arm.
 * Fixed a few checkpatch --strict warnings.

Changes from v1:
https://lore.kernel.org/lkml/20190802145017.42543-1-steven.price@arm.com/
 * Host kernel no longer allocates the stolen time structure, instead it
   is allocated by user space. This means the save/restore functionality
   can be removed.
 * Refactored the code so arm has stub implementations and to avoid
   initcall
 * Rebased to pick up Documentation/{virt->virtual} change
 * Bunch of typo fixes

Christoffer Dall (1):
  KVM: arm/arm64: Factor out hypercall handling from PSCI code

Steven Price (9):
  KVM: arm64: Document PV-time interface
  KVM: arm64: Implement PV_FEATURES call
  KVM: Implement kvm_put_guest()
  KVM: arm64: Support stolen time reporting via shared structure
  KVM: Allow kvm_device_ops to be const
  KVM: arm64: Provide VCPU attributes for stolen time
  arm/arm64: Provide a wrapper for SMCCC 1.1 calls
  arm/arm64: Make use of the SMCCC 1.1 wrapper
  arm64: Retrieve stolen time as paravirtualized guest

 Documentation/virt/kvm/arm/pvtime.txt   |  64 ++++++++++
 Documentation/virt/kvm/devices/vcpu.txt |  14 +++
 arch/arm/include/asm/kvm_host.h         |  26 +++++
 arch/arm/kvm/Makefile                   |   2 +-
 arch/arm/kvm/handle_exit.c              |   2 +-
 arch/arm/mm/proc-v7-bugs.c              |  13 +--
 arch/arm64/include/asm/kvm_host.h       |  30 ++++-
 arch/arm64/include/asm/paravirt.h       |   9 +-
 arch/arm64/include/asm/pvclock-abi.h    |  17 +++
 arch/arm64/include/uapi/asm/kvm.h       |   2 +
 arch/arm64/kernel/cpu_errata.c          |  80 +++++--------
 arch/arm64/kernel/paravirt.c            | 148 ++++++++++++++++++++++++
 arch/arm64/kernel/time.c                |   3 +
 arch/arm64/kvm/Kconfig                  |   1 +
 arch/arm64/kvm/Makefile                 |   2 +
 arch/arm64/kvm/guest.c                  |   9 ++
 arch/arm64/kvm/handle_exit.c            |   4 +-
 include/kvm/arm_hypercalls.h            |  43 +++++++
 include/kvm/arm_psci.h                  |   2 +-
 include/linux/arm-smccc.h               |  58 ++++++++++
 include/linux/cpuhotplug.h              |   1 +
 include/linux/kvm_host.h                |  26 ++++-
 include/linux/kvm_types.h               |   2 +
 include/uapi/linux/kvm.h                |   2 +
 virt/kvm/arm/arm.c                      |  11 ++
 virt/kvm/arm/hypercalls.c               |  68 +++++++++++
 virt/kvm/arm/psci.c                     |  84 +-------------
 virt/kvm/arm/pvtime.c                   | 124 ++++++++++++++++++++
 virt/kvm/kvm_main.c                     |   6 +-
 29 files changed, 699 insertions(+), 154 deletions(-)
 create mode 100644 Documentation/virt/kvm/arm/pvtime.txt
 create mode 100644 arch/arm64/include/asm/pvclock-abi.h
 create mode 100644 include/kvm/arm_hypercalls.h
 create mode 100644 virt/kvm/arm/hypercalls.c
 create mode 100644 virt/kvm/arm/pvtime.c

Comments

Andrew Jones Sept. 3, 2019, 8:03 a.m. UTC | #1
On Fri, Aug 30, 2019 at 09:42:45AM +0100, Steven Price wrote:
> This series add support for paravirtualized time for arm64 guests and
> KVM hosts following the specification in Arm's document DEN 0057A:
> 
> https://developer.arm.com/docs/den0057/a
> 
> It implements support for stolen time, allowing the guest to
> identify time when it is forcibly not executing.
> 
> It doesn't implement support for Live Physical Time (LPT) as there are
> some concerns about the overheads and approach in the above
> specification, and I expect an updated version of the specification to
> be released soon with just the stolen time parts.
> 
> NOTE: Patches 8 and 9 will conflict with Mark Rutland's series[1] cleaning
> up the SMCCC conduit. I do feel that the addition of an _invoke() call
> makes a number of call sites cleaner and it should be possible to
> integrate both this and Mark's other cleanups.
> 
> [1] https://lore.kernel.org/linux-arm-kernel/20190809132245.43505-1-mark.rutland@arm.com/
> 
> Also available as a git tree:
> git://linux-arm.org/linux-sp.git stolen_time/v4
> 
> Changes from v3:
> https://lore.kernel.org/lkml/20190821153656.33429-1-steven.price@arm.com/
>  * There's no longer a PV_TIME device, instead there are attributes on
>    the VCPU. This allows the stolen time structures to be places
>    arbitrarily by user space (subject to 64 byte alignment).
>  * Split documentation between information on the hypercalls and the
>    attributes on the VCPU
>  * Fixed the type of SMCCC functions to return long not int
> 
> Changes from v2:
> https://lore.kernel.org/lkml/20190819140436.12207-1-steven.price@arm.com/
>  * Switched from using gfn_to_hva_cache to a new macro kvm_put_guest()
>    that can provide the single-copy atomicity required (on arm64). This
>    macro is added in patch 4.
>  * Tidied up the locking for kvm_update_stolen_time().
>    pagefault_disable() was unnecessary and the caller didn't need to
>    take kvm->srcu as the function does it itself.
>  * Removed struct kvm_arch_pvtime from the arm implementation, replaced
>    instead with inline static functions which are empty for arm.
>  * Fixed a few checkpatch --strict warnings.
> 
> Changes from v1:
> https://lore.kernel.org/lkml/20190802145017.42543-1-steven.price@arm.com/
>  * Host kernel no longer allocates the stolen time structure, instead it
>    is allocated by user space. This means the save/restore functionality
>    can be removed.
>  * Refactored the code so arm has stub implementations and to avoid
>    initcall
>  * Rebased to pick up Documentation/{virt->virtual} change
>  * Bunch of typo fixes
> 
> Christoffer Dall (1):
>   KVM: arm/arm64: Factor out hypercall handling from PSCI code
> 
> Steven Price (9):
>   KVM: arm64: Document PV-time interface
>   KVM: arm64: Implement PV_FEATURES call
>   KVM: Implement kvm_put_guest()
>   KVM: arm64: Support stolen time reporting via shared structure
>   KVM: Allow kvm_device_ops to be const
>   KVM: arm64: Provide VCPU attributes for stolen time
>   arm/arm64: Provide a wrapper for SMCCC 1.1 calls
>   arm/arm64: Make use of the SMCCC 1.1 wrapper
>   arm64: Retrieve stolen time as paravirtualized guest
> 
>  Documentation/virt/kvm/arm/pvtime.txt   |  64 ++++++++++
>  Documentation/virt/kvm/devices/vcpu.txt |  14 +++
>  arch/arm/include/asm/kvm_host.h         |  26 +++++
>  arch/arm/kvm/Makefile                   |   2 +-
>  arch/arm/kvm/handle_exit.c              |   2 +-
>  arch/arm/mm/proc-v7-bugs.c              |  13 +--
>  arch/arm64/include/asm/kvm_host.h       |  30 ++++-
>  arch/arm64/include/asm/paravirt.h       |   9 +-
>  arch/arm64/include/asm/pvclock-abi.h    |  17 +++
>  arch/arm64/include/uapi/asm/kvm.h       |   2 +
>  arch/arm64/kernel/cpu_errata.c          |  80 +++++--------
>  arch/arm64/kernel/paravirt.c            | 148 ++++++++++++++++++++++++
>  arch/arm64/kernel/time.c                |   3 +
>  arch/arm64/kvm/Kconfig                  |   1 +
>  arch/arm64/kvm/Makefile                 |   2 +
>  arch/arm64/kvm/guest.c                  |   9 ++
>  arch/arm64/kvm/handle_exit.c            |   4 +-
>  include/kvm/arm_hypercalls.h            |  43 +++++++
>  include/kvm/arm_psci.h                  |   2 +-
>  include/linux/arm-smccc.h               |  58 ++++++++++
>  include/linux/cpuhotplug.h              |   1 +
>  include/linux/kvm_host.h                |  26 ++++-
>  include/linux/kvm_types.h               |   2 +
>  include/uapi/linux/kvm.h                |   2 +
>  virt/kvm/arm/arm.c                      |  11 ++
>  virt/kvm/arm/hypercalls.c               |  68 +++++++++++
>  virt/kvm/arm/psci.c                     |  84 +-------------
>  virt/kvm/arm/pvtime.c                   | 124 ++++++++++++++++++++
>  virt/kvm/kvm_main.c                     |   6 +-
>  29 files changed, 699 insertions(+), 154 deletions(-)
>  create mode 100644 Documentation/virt/kvm/arm/pvtime.txt
>  create mode 100644 arch/arm64/include/asm/pvclock-abi.h
>  create mode 100644 include/kvm/arm_hypercalls.h
>  create mode 100644 virt/kvm/arm/hypercalls.c
>  create mode 100644 virt/kvm/arm/pvtime.c
> 
> -- 
> 2.20.1
>

Hi Steven,

I had some fun testing this series with the KVM selftests framework. It
looks like it works to me, so you may add

Tested-by: Andrew Jones <drjones@redhat.com>

if you like. And below is the test I came up with.

Thanks,
drew


From: Andrew Jones <drjones@redhat.com>
Date: Tue, 3 Sep 2019 03:45:08 -0400
Subject: [PATCH] selftests: kvm: aarch64 stolen-time test

Signed-off-by: Andrew Jones <drjones@redhat.com>
---
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/aarch64/stolen-time.c       | 208 ++++++++++++++++++
 2 files changed, 209 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/aarch64/stolen-time.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index ba7849751989..3151264039ad 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -28,6 +28,7 @@ TEST_GEN_PROGS_x86_64 += clear_dirty_log_test
 TEST_GEN_PROGS_x86_64 += dirty_log_test
 TEST_GEN_PROGS_x86_64 += kvm_create_max_vcpus
 
+TEST_GEN_PROGS_aarch64 += aarch64/stolen-time
 TEST_GEN_PROGS_aarch64 += clear_dirty_log_test
 TEST_GEN_PROGS_aarch64 += dirty_log_test
 TEST_GEN_PROGS_aarch64 += kvm_create_max_vcpus
diff --git a/tools/testing/selftests/kvm/aarch64/stolen-time.c b/tools/testing/selftests/kvm/aarch64/stolen-time.c
new file mode 100644
index 000000000000..36df2f6baa17
--- /dev/null
+++ b/tools/testing/selftests/kvm/aarch64/stolen-time.c
@@ -0,0 +1,208 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * AArch64 PV stolen time test
+ *
+ * Copyright (C) 2019, Red Hat, Inc.
+ */
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <time.h>
+#include <sched.h>
+#include <pthread.h>
+#include <sys/syscall.h>
+#include "kvm_util.h"
+
+#define ST_IPA_BASE		(1 << 30)
+#define MIN_STOLEN_TIME		200000
+
+struct st_time {
+	uint32_t rev;
+	uint32_t attr;
+	uint64_t st_time;
+};
+
+static uint64_t st_ipa_offset[4];
+static uint64_t guest_stolen_time[4];
+
+static void guest_code(void)
+{
+	struct st_time *st_time;
+	uint64_t cpu;
+	int64_t ipa;
+	int32_t ret;
+
+	asm volatile("mrs %0, mpidr_el1" : "=r" (cpu));
+	cpu &= 0x3;
+
+	asm volatile(
+		"mov	x0, %1\n"
+		"mov	x1, %2\n"
+		"hvc	#0\n"
+		"mov	%0, x0\n"
+	: "=r" (ret) : "r" (0x80000001), "r" (0xc5000020) :
+	  "x0", "x1", "x2", "x3");
+
+	GUEST_ASSERT(ret == 0);
+
+	asm volatile(
+		"mov	x0, %1\n"
+		"mov	x1, %2\n"
+		"hvc	#0\n"
+		"mov	%0, x0\n"
+	: "=r" (ret) : "r" (0xc5000020), "r" (0xc5000022) :
+	  "x0", "x1", "x2", "x3");
+
+	GUEST_ASSERT(ret == 0);
+
+	asm volatile(
+		"mov	x0, %1\n"
+		"hvc	#0\n"
+		"mov	%0, x0\n"
+	: "=r" (ipa) : "r" (0xc5000022) :
+	  "x0", "x1", "x2", "x3");
+
+	GUEST_ASSERT(ipa == ST_IPA_BASE + st_ipa_offset[cpu]);
+
+	st_time = (struct st_time *)ipa;
+	GUEST_ASSERT(st_time->rev == 0);
+	GUEST_ASSERT(st_time->attr == 0);
+
+	guest_stolen_time[cpu] = st_time->st_time;
+	GUEST_SYNC(0);
+
+	guest_stolen_time[cpu] = st_time->st_time;
+	GUEST_DONE();
+}
+
+static long get_run_delay(void)
+{
+	char path[64];
+	long val[2];
+	FILE *fp;
+
+	sprintf(path, "/proc/%ld/schedstat", syscall(SYS_gettid));
+	fp = fopen(path, "r");
+	fscanf(fp, "%ld %ld ", &val[0], &val[1]);
+	fclose(fp);
+
+	return val[1];
+}
+
+static void *steal_time(void *arg)
+{
+	uint64_t nsecs_per_sec = 1000000000ul;
+	uint64_t sec, nsec;
+	struct timespec ts;
+
+	clock_gettime(CLOCK_MONOTONIC, &ts);
+	sec = ts.tv_sec;
+	nsec = ts.tv_nsec + MIN_STOLEN_TIME;
+	if (nsec > nsecs_per_sec) {
+		sec += 1;
+		nsec -= nsecs_per_sec;
+	}
+
+	while (1) {
+		clock_gettime(CLOCK_MONOTONIC, &ts);
+		if (ts.tv_sec > sec || (ts.tv_sec == sec && ts.tv_nsec >= nsec))
+			break;
+	}
+
+	return NULL;
+}
+
+static void run_vcpu(struct kvm_vm *vm, uint32_t vcpuid)
+{
+	struct ucall uc;
+
+	vcpu_ioctl(vm, vcpuid, KVM_RUN, NULL);
+
+	switch (get_ucall(vm, vcpuid, &uc)) {
+	case UCALL_SYNC:
+	case UCALL_DONE:
+		break;
+	case UCALL_ABORT:
+		TEST_ASSERT(false, "%s at %s:%d", (const char *)uc.args[0],
+			    __FILE__, uc.args[1]);
+	default:
+		TEST_ASSERT(false, "Unexpected exit: %s",
+			    exit_reason_str(vcpu_state(vm, vcpuid)->exit_reason));
+	}
+}
+
+int main(int ac, char **av)
+{
+	struct kvm_device_attr dev = {
+		.group = KVM_ARM_VCPU_PVTIME_CTRL,
+		.attr = KVM_ARM_VCPU_PVTIME_SET_IPA,
+	};
+	uint64_t pvtime_memslot_size;
+	struct kvm_vm *vm;
+	pthread_attr_t attr;
+	pthread_t thread;
+	cpu_set_t cpuset;
+	long stolen_time;
+	int i;
+
+	CPU_ZERO(&cpuset);
+	CPU_SET(0, &cpuset);
+	pthread_attr_init(&attr);
+	pthread_attr_setaffinity_np(&attr, sizeof(cpu_set_t), &cpuset);
+	pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);
+
+	pvtime_memslot_size = 64 * 1024; /* one maximum-sized host page */
+
+	/* create a one-vcpu guest and the pvtime memslot */
+	vm = vm_create_default(0, 0, guest_code);
+	vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS, ST_IPA_BASE, 1,
+				    16 /* vm uses 4k pages */, 0);
+	virt_map(vm, ST_IPA_BASE, ST_IPA_BASE, pvtime_memslot_size, 0);
+	ucall_init(vm, UCALL_MMIO, NULL);
+
+	/* add 3 more vcpus */
+	for (i = 1; i < 4; ++i) {
+		vm_vcpu_add_default(vm, i, guest_code);
+		st_ipa_offset[i] = i * 64;
+		sync_global_to_guest(vm, st_ipa_offset[i]);
+	}
+
+	/* add pvtime to each vcpu */
+	for (i = 0; i < 4; ++i) {
+		uint64_t st_ipa = ST_IPA_BASE + st_ipa_offset[i];
+		dev.addr = (uint64_t)&st_ipa;
+		vcpu_ioctl(vm, i, KVM_HAS_DEVICE_ATTR, &dev);
+		vcpu_ioctl(vm, i, KVM_SET_DEVICE_ATTR, &dev);
+	}
+
+	/* run the tests on each vcpu */
+	for (i = 0; i < 4; ++i) {
+		/* first vcpu run */
+		run_vcpu(vm, i);
+		sync_global_from_guest(vm, guest_stolen_time[i]);
+		TEST_ASSERT(guest_stolen_time[i] == 0, "Expected stolen_time = 0");
+
+		/* steal time from the vcpu */
+		stolen_time = get_run_delay();
+		pthread_create(&thread, &attr, steal_time, NULL);
+		pthread_yield();
+		pthread_join(thread, NULL);
+		stolen_time = get_run_delay() - stolen_time;
+		TEST_ASSERT(stolen_time >= MIN_STOLEN_TIME,
+			    "Expected stolen time >= %ld, got %ld",
+			    MIN_STOLEN_TIME, stolen_time);
+
+		/* run vcpu again and check the stolen time */
+		run_vcpu(vm, i);
+		sync_global_from_guest(vm, guest_stolen_time[i]);
+		TEST_ASSERT(guest_stolen_time[i] >= stolen_time,
+			    "Expected stolen_time >= %ld, got %ld",
+			    stolen_time, guest_stolen_time[i]);
+
+		printf("CPU%d: %ld", i, guest_stolen_time[i]);
+		if (stolen_time == guest_stolen_time[i])
+			printf(" (BONUS: guest stolen_time even exactly matches run_delay)");
+		printf("\n");
+	}
+
+	return 0;
+}
Andrew Jones Sept. 3, 2019, 8:49 a.m. UTC | #2
On Tue, Sep 03, 2019 at 10:03:48AM +0200, Andrew Jones wrote:
> Hi Steven,
> 
> I had some fun testing this series with the KVM selftests framework. It
> looks like it works to me, so you may add
> 
> Tested-by: Andrew Jones <drjones@redhat.com>
>

Actually, I probably shouldn't be quite so generous with this tag yet,
because I haven't yet tested the guest-side changes. To do that I'll
need to start prototyping something for QEMU. I need to finish some other
stuff first, but then I can do that.

Thanks,
drew
Steven Price Sept. 4, 2019, 4:02 p.m. UTC | #3
On 03/09/2019 09:49, Andrew Jones wrote:
> On Tue, Sep 03, 2019 at 10:03:48AM +0200, Andrew Jones wrote:
>> Hi Steven,
>>
>> I had some fun testing this series with the KVM selftests framework. It
>> looks like it works to me, so you may add
>>
>> Tested-by: Andrew Jones <drjones@redhat.com>
>>
> 
> Actually, I probably shouldn't be quite so generous with this tag yet,
> because I haven't yet tested the guest-side changes. To do that I'll
> need to start prototyping something for QEMU. I need to finish some other
> stuff first, but then I can do that.

Thanks for the testing, I'll wait for your other testing before adding
your Tested-by tag.

Steve