[RFC,v7,51/78] KVM: introspection: add KVMI_VCPU_PAUSE

Message ID	20200207181636.1065-52-alazar@bitdefender.com (mailing list archive)
State	New, archived
Headers
Return-Path: <SRS0=xIo7=33=vger.kernel.org=kvm-owner@kernel.org>
From: Adalbert Lazăr <alazar@bitdefender.com>
To: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org, Paolo Bonzini <pbonzini@redhat.com>, Sean Christopherson <sean.j.christopherson@intel.com>, Adalbert Lazăr <alazar@bitdefender.com>
Subject: [RFC PATCH v7 51/78] KVM: introspection: add KVMI_VCPU_PAUSE
Date: Fri, 7 Feb 2020 20:16:09 +0200
Message-Id: <20200207181636.1065-52-alazar@bitdefender.com>
In-Reply-To: <20200207181636.1065-1-alazar@bitdefender.com>
References: <20200207181636.1065-1-alazar@bitdefender.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Sender: kvm-owner@vger.kernel.org
Precedence: bulk
Series	VM introspection

diff --git a/Documentation/virt/kvm/kvmi.rst b/Documentation/virt/kvm/kvmi.rst
index 8eb0006349d6..ba01b9a249a2 100644
--- a/Documentation/virt/kvm/kvmi.rst
+++ b/Documentation/virt/kvm/kvmi.rst
@@ -465,12 +465,52 @@ Returns the TSC frequency (in HZ) for the specified vCPU if available
 * -KVM_EINVAL - padding is not zero
 * -KVM_EAGAIN - the selected vCPU can't be introspected yet
 
+9. KVMI_VCPU_PAUSE
+------------------
+
+:Architecture: all
+:Versions: >= 1
+:Parameters:
+
+	struct kvmi_vcpu_hdr;
+	struct kvmi_vcpu_pause {
+		__u8 wait;
+		__u8 padding1;
+		__u16 padding2;
+		__u32 padding3;
+	};
+
+:Returns:
+
+::
+
+	struct kvmi_error_code;
+
+Kicks the vCPU from guest.
+
+If `wait` is 1, the command will wait for vCPU to acknowledge the IPI.
+
+The vCPU will handle the pending commands/events and send the
+*KVMI_EVENT_PAUSE_VCPU* event (one for every successful *KVMI_VCPU_PAUSE*
+command) before returning to guest.
+
+The socket will be closed if the *KVMI_EVENT_PAUSE_VCPU* event is disallowed.
+Use *KVMI_VM_CHECK_EVENT* first.
+
+:Errors:
+
+* -KVM_EINVAL - the selected vCPU is invalid
+* -KVM_EINVAL - padding is not zero
+* -KVM_EAGAIN - the selected vCPU can't be introspected yet
+* -KVM_EBUSY - the selected vCPU has too many queued *KVMI_EVENT_PAUSE_VCPU* events
+
+
 Events
 ======
 
 All introspection events (VM or vCPU related) are sent
 using the *KVMI_EVENT* message id. No event will be sent unless
-it is explicitly enabled.
+it is explicitly enabled or requested (eg. *KVMI_EVENT_PAUSE_VCPU*).
 
 The *KVMI_EVENT_UNHOOK* event doesn't have a reply and share the kvmi_event
 structure, for consistency with the vCPU events.
@@ -529,3 +569,27 @@ the guest (see **Unhooking**) and the introspection has been enabled for
 this event (see **KVMI_VM_CONTROL_EVENTS**). The introspection tool has a
 chance to unhook and close the KVMI channel (signaling that the operation
 can proceed).
+
+1. KVMI_EVENT_PAUSE_VCPU
+------------------------
+
+:Architectures: all
+:Versions: >= 1
+:Actions: CONTINUE, CRASH
+:Parameters:
+
+::
+
+	struct kvmi_event;
+
+:Returns:
+
+::
+
+	struct kvmi_vcpu_hdr;
+	struct kvmi_event_reply;
+
+This event is sent in response to a *KVMI_VCPU_PAUSE* command.
+
+This event has a low priority. It will be sent after any other vCPU
+introspection event and when no vCPU introspection command is queued.
diff --git a/include/linux/kvmi_host.h b/include/linux/kvmi_host.h
index 6a0fb481b192..988927c29bf5 100644
--- a/include/linux/kvmi_host.h
+++ b/include/linux/kvmi_host.h
@@ -23,6 +23,8 @@ struct kvm_vcpu_introspection {
 
 	struct list_head job_list;
 	spinlock_t job_lock;
+
+	atomic_t pause_requests;
 };
 
 struct kvm_introspection {
diff --git a/include/uapi/linux/kvmi.h b/include/uapi/linux/kvmi.h
index b36ecc0d6513..54a788c1c204 100644
--- a/include/uapi/linux/kvmi.h
+++ b/include/uapi/linux/kvmi.h
@@ -26,12 +26,14 @@ enum {
 	KVMI_VM_WRITE_PHYSICAL = 8,
 
 	KVMI_VCPU_GET_INFO = 9,
+	KVMI_VCPU_PAUSE = 10,
 
 	KVMI_NUM_MESSAGES
 };
 
 enum {
-	KVMI_EVENT_UNHOOK = 0,
+	KVMI_EVENT_UNHOOK     = 0,
+	KVMI_EVENT_PAUSE_VCPU = 1,
 
 	KVMI_NUM_EVENTS
 };
@@ -97,6 +99,13 @@ struct kvmi_vcpu_hdr {
 	__u32 padding2;
 };
 
+struct kvmi_vcpu_pause {
+	__u8 wait;
+	__u8 padding1;
+	__u16 padding2;
+	__u32 padding3;
+};
+
 struct kvmi_event {
 	__u16 size;
 	__u16 vcpu;
diff --git a/tools/testing/selftests/kvm/x86_64/kvmi_test.c b/tools/testing/selftests/kvm/x86_64/kvmi_test.c
index 5c55f4ce5875..942601f6177b 100644
--- a/tools/testing/selftests/kvm/x86_64/kvmi_test.c
+++ b/tools/testing/selftests/kvm/x86_64/kvmi_test.c
@@ -648,6 +648,36 @@ static void test_cmd_get_vcpu_info(struct kvm_vm *vm)
 	DEBUG("tsc_speed: %llu HZ\n", rpl.tsc_speed);
 }
 
+static int cmd_pause_vcpu(struct kvm_vm *vm)
+{
+	struct {
+		struct kvmi_msg_hdr hdr;
+		struct kvmi_vcpu_hdr vcpu_hdr;
+		struct kvmi_vcpu_pause cmd;
+	} req = {};
+	__u16 vcpu_index = 0;
+
+	req.vcpu_hdr.vcpu = vcpu_index;
+
+	return do_command(KVMI_VCPU_PAUSE, &req.hdr, sizeof(req),
+			     NULL, 0);
+}
+
+static void pause_vcpu(struct kvm_vm *vm)
+{
+	int r;
+
+	r = cmd_pause_vcpu(vm);
+	TEST_ASSERT(r == 0,
+		"KVMI_VCPU_PAUSE failed, error %d(%s)\n",
+		-r, kvm_strerror(-r));
+}
+
+static void test_pause(struct kvm_vm *vm)
+{
+	pause_vcpu(vm);
+}
+
 static void test_introspection(struct kvm_vm *vm)
 {
 	setup_socket();
@@ -662,6 +692,7 @@ static void test_introspection(struct kvm_vm *vm)
 	test_cmd_vm_control_events();
 	test_memory_access(vm);
 	test_cmd_get_vcpu_info(vm);
+	test_pause(vm);
 
 	unhook_introspection(vm);
 }
diff --git a/virt/kvm/introspection/kvmi.c b/virt/kvm/introspection/kvmi.c
index ea86512ca81e..51c090a56242 100644
--- a/virt/kvm/introspection/kvmi.c
+++ b/virt/kvm/introspection/kvmi.c
@@ -9,6 +9,10 @@
 #include "kvmi_int.h"
 #include <linux/kthread.h>
 
+enum {
+	MAX_PAUSE_REQUESTS = 1001
+};
+
 static struct kmem_cache *msg_cache;
 static struct kmem_cache *job_cache;
 
@@ -65,10 +69,14 @@ void kvmi_uninit(void)
 	kvmi_cache_destroy();
 }
 
-static void kvmi_make_request(struct kvm_vcpu *vcpu)
+static void kvmi_make_request(struct kvm_vcpu *vcpu, bool wait)
 {
 	kvm_make_request(KVM_REQ_INTROSPECTION, vcpu);
-	kvm_vcpu_kick(vcpu);
+
+	if (wait)
+		kvm_vcpu_kick_and_wait(vcpu);
+	else
+		kvm_vcpu_kick(vcpu);
 }
 
 static int __kvmi_add_job(struct kvm_vcpu *vcpu,
@@ -103,7 +111,7 @@ int kvmi_add_job(struct kvm_vcpu *vcpu,
 	err = __kvmi_add_job(vcpu, fct, ctx, free_fct);
 
 	if (!err)
-		kvmi_make_request(vcpu);
+		kvmi_make_request(vcpu, false);
 
 	return err;
 }
@@ -278,6 +286,22 @@ static int __kvmi_hook(struct kvm *kvm,
 	return 0;
 }
 
+static void kvmi_job_release_vcpu(struct kvm_vcpu *vcpu, void *ctx)
+{
+	struct kvm_vcpu_introspection *vcpui = VCPUI(vcpu);
+
+	atomic_set(&vcpui->pause_requests, 0);
+}
+
+static void kvmi_release_vcpus(struct kvm *kvm)
+{
+	struct kvm_vcpu *vcpu;
+	int i;
+
+	kvm_for_each_vcpu(i, vcpu, kvm)
+		kvmi_add_job(vcpu, kvmi_job_release_vcpu, NULL, NULL);
+}
+
 static int kvmi_recv_thread(void *arg)
 {
 	struct kvm_introspection *kvmi = arg;
@@ -291,6 +315,8 @@ static int kvmi_recv_thread(void *arg)
 	 */
 	kvmi_sock_shutdown(kvmi);
 
+	kvmi_release_vcpus(kvmi->kvm);
+
 	kvmi_put(kvmi->kvm);
 	return 0;
 }
@@ -676,15 +702,45 @@ void kvmi_run_jobs(struct kvm_vcpu *vcpu)
 	}
 }
 
+static void kvmi_vcpu_pause_event(struct kvm_vcpu *vcpu)
+{
+	struct kvm_vcpu_introspection *vcpui = VCPUI(vcpu);
+
+	atomic_dec(&vcpui->pause_requests);
+	/* to be implemented */
+}
+
 void kvmi_handle_requests(struct kvm_vcpu *vcpu)
 {
+	struct kvm_vcpu_introspection *vcpui = VCPUI(vcpu);
 	struct kvm_introspection *kvmi;
 
 	kvmi = kvmi_get(vcpu->kvm);
 	if (!kvmi)
 		return;
 
-	kvmi_run_jobs(vcpu);
+	for (;;) {
+		kvmi_run_jobs(vcpu);
+
+		if (atomic_read(&vcpui->pause_requests))
+			kvmi_vcpu_pause_event(vcpu);
+		else
+			break;
+	}
 
 	kvmi_put(vcpu->kvm);
 }
+
+int kvmi_cmd_vcpu_pause(struct kvm_vcpu *vcpu, bool wait)
+{
+	struct kvm_vcpu_introspection *vcpui = VCPUI(vcpu);
+
+	if (atomic_read(&vcpui->pause_requests) > MAX_PAUSE_REQUESTS)
+		return -KVM_EBUSY;
+
+	atomic_inc(&vcpui->pause_requests);
+
+	kvmi_make_request(vcpu, wait);
+
+	return 0;
+}
diff --git a/virt/kvm/introspection/kvmi_int.h b/virt/kvm/introspection/kvmi_int.h
index bab73fc232ec..d1d93488af1c 100644
--- a/virt/kvm/introspection/kvmi_int.h
+++ b/virt/kvm/introspection/kvmi_int.h
@@ -21,7 +21,9 @@
 #define KVMI_KNOWN_VM_EVENTS ( \
 		BIT(KVMI_EVENT_UNHOOK) \
 	)
-#define KVMI_KNOWN_VCPU_EVENTS 0
+#define KVMI_KNOWN_VCPU_EVENTS ( \
+		BIT(KVMI_EVENT_PAUSE_VCPU) \
+	)
 
 #define KVMI_KNOWN_EVENTS (KVMI_KNOWN_VM_EVENTS | KVMI_KNOWN_VCPU_EVENTS)
 
@@ -34,6 +36,7 @@
 		| BIT(KVMI_VM_READ_PHYSICAL) \
 		| BIT(KVMI_VM_WRITE_PHYSICAL) \
 		| BIT(KVMI_VCPU_GET_INFO) \
+		| BIT(KVMI_VCPU_PAUSE) \
 	)
 
 #define KVMI(kvm) ((struct kvm_introspection *)((kvm)->kvmi))
@@ -68,6 +71,7 @@ int kvmi_cmd_read_physical(struct kvm *kvm, u64 gpa, u64 size,
 			   const struct kvmi_msg_hdr *ctx);
 int kvmi_cmd_write_physical(struct kvm *kvm, u64 gpa, u64 size,
 			    const void *buf);
+int kvmi_cmd_vcpu_pause(struct kvm_vcpu *vcpu, bool wait);
 
 /* arch */
 int kvmi_arch_cmd_vcpu_get_info(struct kvm_vcpu *vcpu,
diff --git a/virt/kvm/introspection/kvmi_msg.c b/virt/kvm/introspection/kvmi_msg.c
index 4e7a2ceb78da..1eae0a9a8e0a 100644
--- a/virt/kvm/introspection/kvmi_msg.c
+++ b/virt/kvm/introspection/kvmi_msg.c
@@ -25,6 +25,7 @@ static const char *const msg_IDs[] = {
 	[KVMI_VM_READ_PHYSICAL] = "KVMI_VM_READ_PHYSICAL",
 	[KVMI_VM_WRITE_PHYSICAL] = "KVMI_VM_WRITE_PHYSICAL",
 	[KVMI_VCPU_GET_INFO] = "KVMI_VCPU_GET_INFO",
+	[KVMI_VCPU_PAUSE] = "KVMI_VCPU_PAUSE",
 };
 
 static bool is_known_message(u16 id)
@@ -291,6 +292,43 @@ static int handle_write_physical(struct kvm_introspection *kvmi,
 	return kvmi_msg_vm_reply(kvmi, msg, ec, NULL, 0);
 }
 
+/*
+ * We handle this vCPU command on the receiving thread to make it easier
+ * for userspace to implement a 'pause VM' command. Usually, this is done
+ * by sending one 'pause vCPU' command for every vCPU. By handling the
+ * command here, the userspace can consider that the VM has stopped
+ * once it receives the reply for the last 'pause vCPU' command.
+ */
+static int handle_pause_vcpu(struct kvm_introspection *kvmi,
+			     const struct kvmi_msg_hdr *msg,
+			     const void *_req)
+{
+	const struct kvmi_vcpu_pause *req = _req;
+	const struct kvmi_vcpu_hdr *cmd;
+	struct kvm_vcpu *vcpu = NULL;
+	int err;
+
+	if (req->padding1 || req->padding2 || req->padding3)
+		return -KVM_EINVAL;
+
+	if (!is_event_allowed(kvmi, KVMI_EVENT_PAUSE_VCPU))
+		return -KVM_EPERM;
+
+	cmd = (const struct kvmi_vcpu_hdr *) (msg + 1);
+
+	if (invalid_vcpu_hdr(cmd)) {
+		err = -KVM_EINVAL;
+		goto reply;
+	}
+
+	err = kvmi_get_vcpu(kvmi, cmd->vcpu, &vcpu);
+	if (!err)
+		err = kvmi_cmd_vcpu_pause(vcpu, req->wait == 1);
+
+reply:
+	return kvmi_msg_vm_reply(kvmi, msg, err, NULL, 0);
+}
+
 /*
  * These commands are executed by the receiving thread/worker.
  */
@@ -303,6 +341,7 @@ static int(*const msg_vm[])(struct kvm_introspection *,
 	[KVMI_VM_GET_INFO] = handle_get_info,
 	[KVMI_VM_READ_PHYSICAL] = handle_read_physical,
 	[KVMI_VM_WRITE_PHYSICAL] = handle_write_physical,
+	[KVMI_VCPU_PAUSE] = handle_pause_vcpu,
 };
 
 static int handle_get_vcpu_info(const struct kvmi_vcpu_cmd_job *job,
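The request that the selftest's cmd_pause_vcpu() assembles can be sketched in plain userspace C, which may help when writing an introspection tool against this command. This is only an illustration: every sketch_ name is hypothetical, and the layouts of the message header and of the vCPU header (beyond its padding2 field) are assumptions, as is the convention that hdr.size counts only the payload after the message header. Only struct kvmi_vcpu_pause and the command id 10 are taken from the patch.

```c
#include <stdint.h>
#include <string.h>

/* Assumed layout of the generic message header (not shown in this patch). */
struct sketch_msg_hdr {
	uint16_t id;
	uint16_t size;
	uint32_t seq;
};

/* Assumed layout; the patch context only shows the trailing padding2. */
struct sketch_vcpu_hdr {
	uint16_t vcpu;
	uint16_t padding1;
	uint32_t padding2;
};

/* Mirrors struct kvmi_vcpu_pause from the patch. */
struct sketch_vcpu_pause {
	uint8_t wait;
	uint8_t padding1;
	uint16_t padding2;
	uint32_t padding3;
};

struct sketch_pause_req {
	struct sketch_msg_hdr hdr;
	struct sketch_vcpu_hdr vcpu_hdr;
	struct sketch_vcpu_pause cmd;
};

enum { SKETCH_KVMI_VCPU_PAUSE = 10 };	/* KVMI_VCPU_PAUSE in the patch */

/*
 * Fill one pause request. Zeroing the whole structure first keeps all
 * padding fields at 0, which the handler requires (-KVM_EINVAL otherwise).
 */
static void sketch_build_pause(struct sketch_pause_req *req, uint16_t vcpu,
			       int wait, uint32_t seq)
{
	memset(req, 0, sizeof(*req));
	req->hdr.id = SKETCH_KVMI_VCPU_PAUSE;
	/* Assumption: size covers everything after the message header. */
	req->hdr.size = (uint16_t)(sizeof(*req) - sizeof(req->hdr));
	req->hdr.seq = seq;
	req->vcpu_hdr.vcpu = vcpu;
	req->cmd.wait = wait ? 1 : 0;
}
```

A 'pause VM' operation, as described in the comment above handle_pause_vcpu(), would call sketch_build_pause() once per vCPU index and send each request; the VM can be considered stopped once the reply to the last one arrives.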

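The pause_requests accounting in kvmi_cmd_vcpu_pause() and kvmi_handle_requests() can be modelled outside the kernel to see how the counter behaves. The following is a minimal userspace sketch using C11 atomics in place of the kernel's atomic_t; the sketch_ names and the SKETCH_EBUSY value are placeholders, and the job queue is reduced to a comment.

```c
#include <stdatomic.h>

enum { SKETCH_MAX_PAUSE_REQUESTS = 1001 };	/* same limit as the patch */
#define SKETCH_EBUSY 16	/* placeholder, not the real KVM_EBUSY value */

struct sketch_vcpu {
	atomic_int pause_requests;
};

/* Models kvmi_cmd_vcpu_pause(): reject when too many events are queued. */
static int sketch_cmd_pause(struct sketch_vcpu *v)
{
	if (atomic_load(&v->pause_requests) > SKETCH_MAX_PAUSE_REQUESTS)
		return -SKETCH_EBUSY;

	atomic_fetch_add(&v->pause_requests, 1);
	/* The kernel would now kick the vCPU, optionally waiting for it. */
	return 0;
}

/*
 * Models the loop in kvmi_handle_requests(): one pause event is produced
 * per pending request, and jobs are re-run before each check.
 * Returns the number of pause events that would have been sent.
 */
static int sketch_handle_requests(struct sketch_vcpu *v)
{
	int events = 0;

	for (;;) {
		/* kvmi_run_jobs(vcpu) would drain queued jobs here. */
		if (atomic_load(&v->pause_requests) == 0)
			break;
		atomic_fetch_sub(&v->pause_requests, 1);
		events++;
	}
	return events;
}
```

Because the check uses a strict `>` against the limit, this model accepts a request while the counter is at 1001, so up to 1002 requests can be pending before -EBUSY is returned. The check and the increment are also separate operations, so concurrent callers could overshoot further; in the patch the command is handled on the single receiving thread.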