From patchwork Thu Jun 30 18:45:32 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Corneliu ZUZU X-Patchwork-Id: 9208839 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 0B5C9607D8 for ; Thu, 30 Jun 2016 18:47:54 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id F256D2844F for ; Thu, 30 Jun 2016 18:47:53 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E33DF2844E; Thu, 30 Jun 2016 18:47:53 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.2 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id EA9422844E for ; Thu, 30 Jun 2016 18:47:52 +0000 (UTC) Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.84_2) (envelope-from ) id 1bIgy6-0001pn-Ts; Thu, 30 Jun 2016 18:45:42 +0000 Received: from mail6.bemta6.messagelabs.com ([85.158.143.247]) by lists.xenproject.org with esmtp (Exim 4.84_2) (envelope-from ) id 1bIgy5-0001pV-3K for xen-devel@lists.xen.org; Thu, 30 Jun 2016 18:45:41 +0000 Received: from [85.158.143.35] by server-2.bemta-6.messagelabs.com id CF/53-11548-4D865775; Thu, 30 Jun 2016 18:45:40 +0000 X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFlrFIsWRWlGSWpSXmKPExsUSfTxjoe7ljNJ wg8MT1CyWfFzM4sDocXT3b6YAxijWzLyk/IoE1ox/y4+wFPRmVyx5N52tgfFHYBcjJ4eQgIfE rv3HmLoYuYDstYwSj3d1MkI4JxklrlzZzQJR5S6xaO96FriqNx8aWEESbALaEucO3WMCsUUEp CWufb4M1s0s8IdRYv+RDkaQhLBAgsTHG1fAGlgEVCVen1kKFucVcJG4s7yBrYuRg0NCQE5iwY V0kDCngKvElt/9zBCLXSR+/wI5AqQkR+LMKWMIU0rif6sSyCYJgaUsElsP/wU7QUJARuLRxJt sExiFFjAyrGJUL04tKkst0jXWSyrKTM8oyU3MzNE1NDDTy00tLk5MT81JTCrWS87P3cQIDEMG INjB2PHP6RCjJAeTkijvvZDScCG+pPyUyozE4oz4otKc1OJDjDIcHEoSvCzAsBYSLEpNT61Iy 8wBRgRMWoKDR0mEd1s6UJq3uCAxtzgzHSJ1ilFRSpx3MkhCACSRUZoH1waLwkuMslLCvIxAhw jxFKQW5WaWoMq/YhTnYFQS5j0KMoUnM68EbvoroMVMQIuZS4tBFpckIqSkGhijTx+ONrio5uM 5s1W+I19I1XO7bpG3x+8rzntvFO1tmrsw9HqJPffHl1GdGjdkrqy+KX286maG9TXHw073vq8o 4N/ItzB7ntCMN2275i2uY3bXWerS2dxyVeOEULUcs5tH1foFE7e5rGDc+9cpJ3XZivWCcp42F 8VOHD7cl8dfyy6uz/ZNIEmJpTgj0VCLuag4EQD4BSrhvQIAAA== X-Env-Sender: czuzu@bitdefender.com X-Msg-Ref: server-8.tower-21.messagelabs.com!1467312339!21673881!1 X-Originating-IP: [91.199.104.161] X-SpamReason: No, hits=0.0 required=7.0 tests= X-StarScan-Received: X-StarScan-Version: 8.46; banners=-,-,- X-VirusChecked: Checked Received: (qmail 51252 invoked from network); 30 Jun 2016 18:45:39 -0000 Received: from mx01.bbu.dsd.mx.bitdefender.com (HELO mx01.bbu.dsd.mx.bitdefender.com) (91.199.104.161) by server-8.tower-21.messagelabs.com with DHE-RSA-AES128-GCM-SHA256 encrypted SMTP; 30 Jun 2016 18:45:39 -0000 Received: (qmail 23159 invoked from network); 30 Jun 2016 21:45:38 +0300 Received: from unknown (HELO mx-sr.buh.bitdefender.com) (10.17.80.103) by mx01.bbu.dsd.mx.bitdefender.com with AES256-GCM-SHA384 encrypted SMTP; 30 Jun 2016 21:45:38 +0300 Received: from smtp01.buh.bitdefender.com (unknown [10.17.80.75]) by mx-sr.buh.bitdefender.com (Postfix) with ESMTP id 623A07FBE5 for ; Thu, 30 Jun 2016 21:45:38 +0300 (EEST) Received: (qmail 1360 invoked from network); 30 Jun 2016 21:45:38 +0300 Received: from 188-24-82-29.rdsnet.ro (HELO localhost.localdomain) (czuzu@bitdefender.com@188.24.82.29) by smtp01.buh.bitdefender.com with SMTP; 30 Jun 2016 21:45:38 +0300 From: Corneliu ZUZU To: xen-devel@lists.xen.org Date: Thu, 30 Jun 2016 21:45:32 +0300 Message-Id: <1467312332-3098-1-git-send-email-czuzu@bitdefender.com> X-Mailer: git-send-email 2.5.0 In-Reply-To: <1467312015-2867-1-git-send-email-czuzu@bitdefender.com> References: <1467312015-2867-1-git-send-email-czuzu@bitdefender.com> X-BitDefender-Scanner: Clean, Agent: BitDefender qmail 3.1.6 on smtp01.buh.bitdefender.com, sigver: 7.66124 X-BitDefender-Spam: No (0) X-BitDefender-SpamStamp: Build: [Engines: 2.15.6.911, Dats: 425113, Stamp: 3], Multi: [Enabled, t: (0.000011, 0.009788)], BW: [Enabled, t: (0.000007,0.000001)], RBL DNSBL: [Disabled], APM: [Enabled, Score: 500, t: (0.008648), Flags: BB9BAF5C; NN_NO_CONTENT_TYPE; NN_LEGIT_SUMM_400_WORDS; NN_NO_LINK_NMD; NN_LEGIT_BITDEFENDER; NN_LEGIT_S_SQARE_BRACKETS; NN_LEGIT_MAILING_LIST_TO], SGN: [Enabled, t: (0.013616)], URL: [Enabled, t: (0.000005)], RTDA: [Enabled, t: (0.027375), Hit: No, Details: v2.3.10; Id: 2m1ghm1.1amgkp5go.152r], total: 0(775) X-BitDefender-CF-Stamp: none Cc: Tamas K Lengyel , Razvan Cojocaru , George Dunlap , Andrew Cooper , Paul Durrant , Jan Beulich Subject: [Xen-devel] [PATCH 5/8] x86/vm-event/monitor: don't compromise monitor_write_data on domain cleanup X-BeenThere: xen-devel@lists.xen.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: xen-devel-bounces@lists.xen.org Sender: "Xen-devel" X-Virus-Scanned: ClamAV using ClamSMTP The arch_vm_event structure is dynamically allocated and freed @ vm_event_cleanup_domain. This cleanup is triggered e.g. when the toolstack user disables domain monitoring (xc_monitor_disable), which in turn effectively discards any information that was in arch_vm_event.write_data. But this can yield unexpected behavior since if a CR-write was awaiting to be committed on the scheduling tail (hvm_do_resume->arch_monitor_write_data) before xc_monitor_disable is called, then the domain CR write is wrongfully ignored, which of course, in these cases, can easily render a domain crash. To fix the issue, this patch makes only arch_vm_event.emul_read_data dynamically allocated instead of the whole arch_vm_event structure. With this we can avoid invalidation of an awaiting arch_vm_event.write_data by selectively cleaning up only emul_read_data and emulate_flags @ vm_event_cleanup_domain. Small note: arch_vm_event structure definition needed to be moved from asm-x86/vm_event.h to asm-x86/domain.h in the process. Signed-off-by: Corneliu ZUZU Acked-by: Razvan Cojocaru --- xen/arch/x86/domain.c | 5 ++-- xen/arch/x86/hvm/emulate.c | 8 +++--- xen/arch/x86/hvm/hvm.c | 62 ++++++++++++++++++------------------------ xen/arch/x86/mm/p2m.c | 4 +-- xen/arch/x86/monitor.c | 7 +---- xen/arch/x86/vm_event.c | 16 +++++------ xen/include/asm-x86/domain.h | 42 +++++++++++++++++----------- xen/include/asm-x86/monitor.h | 3 +- xen/include/asm-x86/vm_event.h | 10 ------- 9 files changed, 73 insertions(+), 84 deletions(-) diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index bb59247..06e68ae 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -492,8 +492,9 @@ int vcpu_initialise(struct vcpu *v) void vcpu_destroy(struct vcpu *v) { - xfree(v->arch.vm_event); - v->arch.vm_event = NULL; + v->arch.vm_event.emulate_flags = 0; + xfree(v->arch.vm_event.emul_read_data); + v->arch.vm_event.emul_read_data = NULL; if ( is_pv_32bit_vcpu(v) ) { diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c index 855af4d..68f5515 100644 --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -73,12 +73,12 @@ static int set_context_data(void *buffer, unsigned int size) { struct vcpu *curr = current; - if ( curr->arch.vm_event ) + if ( curr->arch.vm_event.emul_read_data ) { unsigned int safe_size = - min(size, curr->arch.vm_event->emul_read_data.size); + min(size, curr->arch.vm_event.emul_read_data->size); - memcpy(buffer, curr->arch.vm_event->emul_read_data.data, safe_size); + memcpy(buffer, curr->arch.vm_event.emul_read_data->data, safe_size); memset(buffer + safe_size, 0, size - safe_size); return X86EMUL_OKAY; } @@ -523,7 +523,7 @@ static int hvmemul_virtual_to_linear( * vm_event being triggered for repeated writes to a whole page. */ if ( unlikely(current->domain->arch.mem_access_emulate_each_rep) && - current->arch.vm_event->emulate_flags != 0 ) + current->arch.vm_event.emulate_flags != 0 ) max_reps = 1; /* diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index 884ae40..03dffb8 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -473,24 +473,24 @@ void hvm_do_resume(struct vcpu *v) if ( !handle_hvm_io_completion(v) ) return; - if ( unlikely(v->arch.vm_event) ) + if ( unlikely(v->arch.vm_event.emulate_flags) ) { - if ( v->arch.vm_event->emulate_flags ) - { - enum emul_kind kind = EMUL_KIND_NORMAL; + enum emul_kind kind; - if ( v->arch.vm_event->emulate_flags & - VM_EVENT_FLAG_SET_EMUL_READ_DATA ) - kind = EMUL_KIND_SET_CONTEXT; - else if ( v->arch.vm_event->emulate_flags & - VM_EVENT_FLAG_EMULATE_NOWRITE ) - kind = EMUL_KIND_NOWRITE; + ASSERT(v->arch.vm_event.emul_read_data); - hvm_mem_access_emulate_one(kind, TRAP_invalid_op, - HVM_DELIVER_NO_ERROR_CODE); + kind = EMUL_KIND_NORMAL; - v->arch.vm_event->emulate_flags = 0; - } + if ( v->arch.vm_event.emulate_flags & VM_EVENT_FLAG_SET_EMUL_READ_DATA ) + kind = EMUL_KIND_SET_CONTEXT; + else if ( v->arch.vm_event.emulate_flags & + VM_EVENT_FLAG_EMULATE_NOWRITE ) + kind = EMUL_KIND_NOWRITE; + + hvm_mem_access_emulate_one(kind, TRAP_invalid_op, + HVM_DELIVER_NO_ERROR_CODE); + + v->arch.vm_event.emulate_flags = 0; } arch_monitor_write_data(v); @@ -2178,17 +2178,15 @@ int hvm_set_cr0(unsigned long value, bool_t may_defer) if ( may_defer && unlikely(v->domain->arch.monitor.write_ctrlreg_enabled & monitor_ctrlreg_bitmask(VM_EVENT_X86_CR0)) ) { - ASSERT(v->arch.vm_event); - if ( hvm_monitor_crX(CR0, value, old_value) ) { /* * The actual write will occur in arch_monitor_write_data(), if * permitted. */ - ASSERT(MWS_NOWRITE == v->arch.vm_event->write_data.status); - v->arch.vm_event->write_data.status = MWS_CR0; - v->arch.vm_event->write_data.value = value; + ASSERT(MWS_NOWRITE == v->arch.vm_event.write_data.status); + v->arch.vm_event.write_data.status = MWS_CR0; + v->arch.vm_event.write_data.value = value; return X86EMUL_OKAY; } @@ -2284,17 +2282,15 @@ int hvm_set_cr3(unsigned long value, bool_t may_defer) if ( may_defer && unlikely(v->domain->arch.monitor.write_ctrlreg_enabled & monitor_ctrlreg_bitmask(VM_EVENT_X86_CR3)) ) { - ASSERT(v->arch.vm_event); - if ( hvm_monitor_crX(CR3, value, old) ) { /* * The actual write will occur in arch_monitor_write_data(), if * permitted. */ - ASSERT(MWS_NOWRITE == v->arch.vm_event->write_data.status); - v->arch.vm_event->write_data.status = MWS_CR3; - v->arch.vm_event->write_data.value = value; + ASSERT(MWS_NOWRITE == v->arch.vm_event.write_data.status); + v->arch.vm_event.write_data.status = MWS_CR3; + v->arch.vm_event.write_data.value = value; return X86EMUL_OKAY; } @@ -2368,17 +2364,15 @@ int hvm_set_cr4(unsigned long value, bool_t may_defer) if ( may_defer && unlikely(v->domain->arch.monitor.write_ctrlreg_enabled & monitor_ctrlreg_bitmask(VM_EVENT_X86_CR4)) ) { - ASSERT(v->arch.vm_event); - if ( hvm_monitor_crX(CR4, value, old_cr) ) { /* * The actual write will occur in arch_monitor_write_data(), if * permitted. */ - ASSERT(MWS_NOWRITE == v->arch.vm_event->write_data.status); - v->arch.vm_event->write_data.status = MWS_CR4; - v->arch.vm_event->write_data.value = value; + ASSERT(MWS_NOWRITE == v->arch.vm_event.write_data.status); + v->arch.vm_event.write_data.status = MWS_CR4; + v->arch.vm_event.write_data.value = value; return X86EMUL_OKAY; } @@ -3753,16 +3747,14 @@ int hvm_msr_write_intercept(unsigned int msr, uint64_t msr_content, if ( may_defer && unlikely(monitored_msr(v->domain, msr)) ) { - ASSERT(v->arch.vm_event); - /* * The actual write will occur in arch_monitor_write_data(), if * permitted. */ - ASSERT(MWS_NOWRITE == v->arch.vm_event->write_data.status); - v->arch.vm_event->write_data.status = MWS_MSR; - v->arch.vm_event->write_data.msr = msr; - v->arch.vm_event->write_data.value = msr_content; + ASSERT(MWS_NOWRITE == v->arch.vm_event.write_data.status); + v->arch.vm_event.write_data.status = MWS_MSR; + v->arch.vm_event.write_data.msr = msr; + v->arch.vm_event.write_data.value = msr_content; hvm_monitor_msr(msr, msr_content); return X86EMUL_OKAY; diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c index 16733a4..9bcaa8a 100644 --- a/xen/arch/x86/mm/p2m.c +++ b/xen/arch/x86/mm/p2m.c @@ -1639,10 +1639,10 @@ void p2m_mem_access_emulate_check(struct vcpu *v, } } - v->arch.vm_event->emulate_flags = violation ? rsp->flags : 0; + v->arch.vm_event.emulate_flags = violation ? rsp->flags : 0; if ( (rsp->flags & VM_EVENT_FLAG_SET_EMUL_READ_DATA) ) - v->arch.vm_event->emul_read_data = rsp->data.emul_read_data; + *v->arch.vm_event.emul_read_data = rsp->data.emul_read_data; } } diff --git a/xen/arch/x86/monitor.c b/xen/arch/x86/monitor.c index 5c8d4da..88d14ae 100644 --- a/xen/arch/x86/monitor.c +++ b/xen/arch/x86/monitor.c @@ -46,12 +46,7 @@ void arch_monitor_cleanup_domain(struct domain *d) void arch_monitor_write_data(struct vcpu *v) { - struct monitor_write_data *w; - - if ( likely(!v->arch.vm_event) ) - return; - - w = &v->arch.vm_event->write_data; + struct monitor_write_data *w = &v->arch.vm_event.write_data; if ( likely(MWS_NOWRITE == w->status) ) return; diff --git a/xen/arch/x86/vm_event.c b/xen/arch/x86/vm_event.c index 825da48..f21ff10 100644 --- a/xen/arch/x86/vm_event.c +++ b/xen/arch/x86/vm_event.c @@ -30,12 +30,13 @@ int vm_event_init_domain(struct domain *d) for_each_vcpu ( d, v ) { - if ( v->arch.vm_event ) + if ( v->arch.vm_event.emul_read_data ) continue; - v->arch.vm_event = xzalloc(struct arch_vm_event); + v->arch.vm_event.emul_read_data = + xzalloc(struct vm_event_emul_read_data); - if ( !v->arch.vm_event ) + if ( !v->arch.vm_event.emul_read_data ) return -ENOMEM; } @@ -52,8 +53,9 @@ void vm_event_cleanup_domain(struct domain *d) for_each_vcpu ( d, v ) { - xfree(v->arch.vm_event); - v->arch.vm_event = NULL; + v->arch.vm_event.emulate_flags = 0; + xfree(v->arch.vm_event.emul_read_data); + v->arch.vm_event.emul_read_data = NULL; } d->arch.mem_access_emulate_each_rep = 0; @@ -73,13 +75,11 @@ void vm_event_register_write_resume(struct vcpu *v, vm_event_response_t *rsp) { if ( rsp->flags & VM_EVENT_FLAG_DENY ) { - ASSERT(v->arch.vm_event); - /* deny flag requires the vCPU to be paused */ if ( !atomic_read(&v->vm_event_pause_count) ) return; - v->arch.vm_event->write_data.status = MWS_NOWRITE; + v->arch.vm_event.write_data.status = MWS_NOWRITE; } } diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h index a22ee6b..7ea5c8f 100644 --- a/xen/include/asm-x86/domain.h +++ b/xen/include/asm-x86/domain.h @@ -259,21 +259,6 @@ struct pv_domain struct cpuidmasks *cpuidmasks; }; -enum monitor_write_status -{ - MWS_NOWRITE = 0, - MWS_MSR, - MWS_CR0, - MWS_CR3, - MWS_CR4, -}; - -struct monitor_write_data { - enum monitor_write_status status; - uint32_t msr; - uint64_t value; -}; - struct arch_domain { struct page_info *perdomain_l3_pg; @@ -496,6 +481,31 @@ typedef enum __packed { SMAP_CHECK_DISABLED, /* disable the check */ } smap_check_policy_t; +enum monitor_write_status +{ + MWS_NOWRITE = 0, + MWS_MSR, + MWS_CR0, + MWS_CR3, + MWS_CR4, +}; + +struct monitor_write_data { + enum monitor_write_status status; + uint32_t msr; + uint64_t value; +}; + +/* + * Should we emulate the next matching instruction on VCPU resume + * after a vm_event? + */ +struct arch_vm_event { + uint32_t emulate_flags; + struct vm_event_emul_read_data *emul_read_data; + struct monitor_write_data write_data; +}; + struct arch_vcpu { /* @@ -569,7 +579,7 @@ struct arch_vcpu /* A secondary copy of the vcpu time info. */ XEN_GUEST_HANDLE(vcpu_time_info_t) time_info_guest; - struct arch_vm_event *vm_event; + struct arch_vm_event vm_event; }; smap_check_policy_t smap_policy_change(struct vcpu *v, diff --git a/xen/include/asm-x86/monitor.h b/xen/include/asm-x86/monitor.h index 1068376..984ac4c 100644 --- a/xen/include/asm-x86/monitor.h +++ b/xen/include/asm-x86/monitor.h @@ -48,7 +48,8 @@ int arch_monitor_domctl_op(struct domain *d, struct xen_domctl_monitor_op *mop) * Enabling mem_access_emulate_each_rep without a vm_event subscriber * is meaningless. */ - if ( d->max_vcpus && d->vcpu[0] && d->vcpu[0]->arch.vm_event ) + if ( d->max_vcpus && d->vcpu[0] && + d->vcpu[0]->arch.vm_event.emul_read_data ) d->arch.mem_access_emulate_each_rep = !!mop->event; else rc = -EINVAL; diff --git a/xen/include/asm-x86/vm_event.h b/xen/include/asm-x86/vm_event.h index 026f42e..c83583d 100644 --- a/xen/include/asm-x86/vm_event.h +++ b/xen/include/asm-x86/vm_event.h @@ -22,16 +22,6 @@ #include #include -/* - * Should we emulate the next matching instruction on VCPU resume - * after a vm_event? - */ -struct arch_vm_event { - uint32_t emulate_flags; - struct vm_event_emul_read_data emul_read_data; - struct monitor_write_data write_data; -}; - int vm_event_init_domain(struct domain *d); void vm_event_cleanup_domain(struct domain *d);