From patchwork Wed Sep 25 15:23:11 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 11160949
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
 by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F039F13B1
 for ; Wed, 25 Sep 2019 15:24:42 +0000 (UTC)
Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mail.kernel.org (Postfix) with ESMTPS id D5BE02054F
 for ; Wed, 25 Sep 2019 15:24:42 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D5BE02054F
Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=suse.com
Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=xen-devel-bounces@lists.xenproject.org
Received: from localhost ([127.0.0.1] helo=lists.xenproject.org)
 by lists.xenproject.org with esmtp (Exim 4.89) (envelope-from )
 id 1iD98N-0005ky-D8; Wed, 25 Sep 2019 15:23:15 +0000
Received: from us1-rack-iad1.inumbo.com ([172.99.69.81])
 by lists.xenproject.org with esmtp (Exim 4.89) (envelope-from )
 id 1iD98M-0005kr-0z for xen-devel@lists.xenproject.org;
 Wed, 25 Sep 2019 15:23:14 +0000
X-Inumbo-ID: 62f319ae-dfa8-11e9-b588-bc764e2007e4
Received: from mx1.suse.de (unknown [195.135.220.15])
 by localhost (Halon) with ESMTPS id 62f319ae-dfa8-11e9-b588-bc764e2007e4;
 Wed, 25 Sep 2019 15:23:12 +0000 (UTC)
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
Received: from relay2.suse.de (unknown [195.135.220.254])
 by mx1.suse.de (Postfix) with ESMTP id 7324DABCB;
 Wed, 25 Sep 2019 15:23:11 +0000 (UTC)
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
References: <3ce4ab2c-8cb6-1482-6ce9-3d5b019e10c1@suse.com>
Message-ID:
Date: Wed, 25 Sep 2019 17:23:11 +0200
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:60.0) Gecko/20100101
 Thunderbird/60.9.0
MIME-Version: 1.0
In-Reply-To: <3ce4ab2c-8cb6-1482-6ce9-3d5b019e10c1@suse.com>
Content-Language: en-US
Subject: [Xen-devel] [PATCH v3 1/5] x86: suppress XPTI-related TLB flushes when possible
X-BeenThere: xen-devel@lists.xenproject.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Xen developer discussion
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Cc: George Dunlap , Andrew Cooper , Wei Liu , =?utf-8?q?Roger_Pau_Monn=C3=A9?=
Errors-To: xen-devel-bounces@lists.xenproject.org
Sender: "Xen-devel"

When there's no XPTI-enabled PV domain at all, there's no need to issue
respective TLB flushes. Hardwire opt_xpti_* to false when !PV, and
record the creation of PV domains by bumping opt_xpti_* accordingly.

As to the sticky opt_xpti_domu vs increment/decrement of opt_xpti_hwdom,
this is done this way to avoid
(a) widening the former variable,
(b) any risk of a missed flush, which would result in an XSA if a DomU
    was able to exercise it, and
(c) any races updating the variable.
Fundamentally the TLB flush done when context switching out the domain's
vCPU-s the last time before destroying the domain ought to be
sufficient, so in principle DomU handling could be made match hwdom's.

Signed-off-by: Jan Beulich
Reviewed-by: Roger Pau Monné
---
v3: Re-base.
v2: Add comment to spec_ctrl.h. Explain difference in accounting of DomU
    and hwdom.
---
TBD: The hardwiring to false could be extended to opt_pv_l1tf_* and (for
     !HVM) opt_l1d_flush as well.
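For illustration only, the accounting described above can be modelled
outside of Xen. This is a minimal sketch, not the patch's code:
xpti_hwdom, xpti_domu and the model_*() helpers are hypothetical
stand-ins for opt_xpti_hwdom, opt_xpti_domu and the create / destroy /
flush paths. It assumes both options were already resolved to 1
("enabled"), and it simplifies the destroy side by leaving out the
CONFIG_LATE_HWDOM and domain_id conditions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for opt_xpti_hwdom / opt_xpti_domu, assumed to have
 * been set to 1 ("enabled") by command line handling.
 */
static int8_t xpti_hwdom = 1;
static int8_t xpti_domu = 1;

/* Model of PV domain creation; returns whether the domain uses XPTI. */
static bool model_pv_domain_create(bool is_hwdom)
{
    if ( is_hwdom )
    {
        if ( !xpti_hwdom )
            return false;
        ++xpti_hwdom;            /* counted, so it can be undone on destroy */
        return true;
    }

    if ( !xpti_domu )
        return false;
    xpti_domu = 2;               /* sticky: never lowered again */
    return true;
}

/* Simplified model of destroying an XPTI-enabled hardware domain. */
static void model_hwdom_destroy(void)
{
    assert(xpti_hwdom > 1);      /* only valid after a counted creation */
    --xpti_hwdom;
}

/* The extra PCID_PV_XPTI flushes are needed only for values above 1. */
static bool model_need_xpti_flush(void)
{
    return xpti_hwdom > 1 || xpti_domu > 1;
}
```

In this model, creating and then destroying a hardware domain returns
the flush predicate to false, whereas creating a single DomU leaves it
true from then on, which is the sticky behaviour the description
chooses for DomU.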
---
 xen/arch/x86/flushtlb.c         |  2 +-
 xen/arch/x86/pv/domain.c        | 14 +++++++++++++-
 xen/arch/x86/spec_ctrl.c        |  6 ++++++
 xen/include/asm-x86/spec_ctrl.h | 11 +++++++++++
 4 files changed, 31 insertions(+), 2 deletions(-)

--- a/xen/arch/x86/flushtlb.c
+++ b/xen/arch/x86/flushtlb.c
@@ -207,7 +207,7 @@ unsigned int flush_area_local(const void
                  */
                 invpcid_flush_one(PCID_PV_PRIV, addr);
                 invpcid_flush_one(PCID_PV_USER, addr);
-                if ( opt_xpti_hwdom || opt_xpti_domu )
+                if ( opt_xpti_hwdom > 1 || opt_xpti_domu > 1 )
                 {
                     invpcid_flush_one(PCID_PV_PRIV | PCID_PV_XPTI, addr);
                     invpcid_flush_one(PCID_PV_USER | PCID_PV_XPTI, addr);
--- a/xen/arch/x86/pv/domain.c
+++ b/xen/arch/x86/pv/domain.c
@@ -272,6 +272,9 @@ void pv_domain_destroy(struct domain *d)
     destroy_perdomain_mapping(d, GDT_LDT_VIRT_START,
                               GDT_LDT_MBYTES << (20 - PAGE_SHIFT));
 
+    opt_xpti_hwdom -= IS_ENABLED(CONFIG_LATE_HWDOM) &&
+                      !d->domain_id && opt_xpti_hwdom;
+
     XFREE(d->arch.pv.cpuidmasks);
 
     FREE_XENHEAP_PAGE(d->arch.pv.gdt_ldt_l1tab);
@@ -310,7 +313,16 @@ int pv_domain_initialise(struct domain *
     /* 64-bit PV guest by default. */
     d->arch.is_32bit_pv = d->arch.has_32bit_shinfo = 0;
 
-    d->arch.pv.xpti = is_hardware_domain(d) ? opt_xpti_hwdom : opt_xpti_domu;
+    if ( is_hardware_domain(d) && opt_xpti_hwdom )
+    {
+        d->arch.pv.xpti = true;
+        ++opt_xpti_hwdom;
+    }
+    if ( !is_hardware_domain(d) && opt_xpti_domu )
+    {
+        d->arch.pv.xpti = true;
+        opt_xpti_domu = 2;
+    }
 
     if ( !is_pv_32bit_domain(d) && use_invpcid && cpu_has_pcid )
         switch ( ACCESS_ONCE(opt_pcid) )
--- a/xen/arch/x86/spec_ctrl.c
+++ b/xen/arch/x86/spec_ctrl.c
@@ -85,10 +85,12 @@ static int __init parse_spec_ctrl(const
 
             opt_eager_fpu = 0;
 
+#ifdef CONFIG_PV
             if ( opt_xpti_hwdom < 0 )
                 opt_xpti_hwdom = 0;
             if ( opt_xpti_domu < 0 )
                 opt_xpti_domu = 0;
+#endif
 
             if ( opt_smt < 0 )
                 opt_smt = 1;
@@ -187,6 +189,7 @@ static int __init parse_spec_ctrl(const
 }
 custom_param("spec-ctrl", parse_spec_ctrl);
 
+#ifdef CONFIG_PV
 int8_t __read_mostly opt_xpti_hwdom = -1;
 int8_t __read_mostly opt_xpti_domu = -1;
 
@@ -253,6 +256,9 @@ static __init int parse_xpti(const char
     return rc;
 }
 custom_param("xpti", parse_xpti);
+#else /* !CONFIG_PV */
+# define xpti_init_default(caps) ((void)(caps))
+#endif /* CONFIG_PV */
 
 int8_t __read_mostly opt_pv_l1tf_hwdom = -1;
 int8_t __read_mostly opt_pv_l1tf_domu = -1;
--- a/xen/include/asm-x86/spec_ctrl.h
+++ b/xen/include/asm-x86/spec_ctrl.h
@@ -43,7 +43,18 @@ extern bool bsp_delay_spec_ctrl;
 extern uint8_t default_xen_spec_ctrl;
 extern uint8_t default_spec_ctrl_flags;
 
+#ifdef CONFIG_PV
+/*
+ * Values -1, 0, and 1 have the usual meaning of "not established yet",
+ * "disabled", and "enabled". Values larger than 1 indicate there's actually
+ * at least one such domain (or there has been). This way XPTI-specific TLB
+ * flushes can be avoided when no XPTI-enabled domain is/was active.
+ */
 extern int8_t opt_xpti_hwdom, opt_xpti_domu;
+#else
+# define opt_xpti_hwdom false
+# define opt_xpti_domu false
+#endif
 
 extern int8_t opt_pv_l1tf_hwdom, opt_pv_l1tf_domu;
 

From patchwork Wed Sep 25 15:23:45 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 11160951
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
References: <3ce4ab2c-8cb6-1482-6ce9-3d5b019e10c1@suse.com>
Message-ID: <74eb1e77-7445-92fa-25b1-ece1d6699eb9@suse.com>
Date: Wed, 25 Sep 2019 17:23:45 +0200
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0
MIME-Version: 1.0
In-Reply-To: <3ce4ab2c-8cb6-1482-6ce9-3d5b019e10c1@suse.com>
Content-Language: en-US
Subject: [Xen-devel] [PATCH v3 2/5] x86/mm: honor opt_pcid also for 32-bit PV domains
Cc: George Dunlap , Andrew Cooper , Wei Liu , =?utf-8?q?Roger_Pau_Monn=C3=A9?=

I can't see any technical or performance reason why we should treat
32-bit PV different from 64-bit PV in this regard.

Signed-off-by: Jan Beulich
Reviewed-by: Roger Pau Monné
---
 xen/arch/x86/pv/domain.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

--- a/xen/arch/x86/pv/domain.c
+++ b/xen/arch/x86/pv/domain.c
@@ -180,7 +180,24 @@ int switch_compat(struct domain *d)
     d->arch.x87_fip_width = 4;
 
     d->arch.pv.xpti = false;
-    d->arch.pv.pcid = false;
+
+    if ( use_invpcid && cpu_has_pcid )
+        switch ( ACCESS_ONCE(opt_pcid) )
+        {
+        case PCID_OFF:
+        case PCID_XPTI:
+            d->arch.pv.pcid = false;
+            break;
+
+        case PCID_ALL:
+        case PCID_NOXPTI:
+            d->arch.pv.pcid = true;
+            break;
+
+        default:
+            ASSERT_UNREACHABLE();
+            break;
+        }
 
     return 0;
 
@@ -324,7 +341,7 @@ int pv_domain_initialise(struct domain *
         opt_xpti_domu = 2;
     }
 
-    if ( !is_pv_32bit_domain(d) && use_invpcid && cpu_has_pcid )
+    if ( use_invpcid && cpu_has_pcid )
         switch ( ACCESS_ONCE(opt_pcid) )
         {
         case PCID_OFF:

From patchwork Wed Sep 25 15:25:14 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 11160955
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
References: <3ce4ab2c-8cb6-1482-6ce9-3d5b019e10c1@suse.com>
Message-ID:
Date: Wed, 25 Sep 2019 17:25:14 +0200
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0
MIME-Version: 1.0
In-Reply-To: <3ce4ab2c-8cb6-1482-6ce9-3d5b019e10c1@suse.com>
Content-Language: en-US
Subject: [Xen-devel] [PATCH v3 3/5] x86/HVM: move NOFLUSH
 handling out of hvm_set_cr3()
Cc: Petre Pircalabu , Kevin Tian , Tamas K Lengyel , Razvan Cojocaru , Wei Liu , Paul Durrant , George Dunlap , Andrew Cooper , Suravee Suthikulpanit , Jun Nakajima , Alexandru Isaila , Boris Ostrovsky , =?utf-8?q?Roger_Pau_Monn?= =?utf-8?q?=C3=A9?=

The bit is meaningful only for MOV-to-CR3 insns, not anywhere else, in
particular not when loading nested guest state.

Signed-off-by: Jan Beulich
Reviewed-by: Paul Durrant
Acked-by: Andrew Cooper
---
v3: Further restrict "noflush" local variable scopes. Remove (now
    redundant) zapping of X86_CR3_NOFLUSH from hvm_monitor_cr().
---
 xen/arch/x86/hvm/emulate.c        |  8 +++++++-
 xen/arch/x86/hvm/hvm.c            | 20 ++++++++++----------
 xen/arch/x86/hvm/monitor.c        |  3 ---
 xen/arch/x86/hvm/svm/nestedsvm.c  |  6 +++---
 xen/arch/x86/hvm/vm_event.c       |  2 +-
 xen/arch/x86/hvm/vmx/vvmx.c       |  4 ++--
 xen/include/asm-x86/domain.h      |  2 ++
 xen/include/asm-x86/hvm/support.h |  2 +-
 8 files changed, 26 insertions(+), 21 deletions(-)

--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -2123,8 +2123,14 @@ static int hvmemul_write_cr(
         break;
 
     case 3:
-        rc = hvm_set_cr3(val, true);
+    {
+        bool noflush = hvm_pcid_enabled(current) && (val & X86_CR3_NOFLUSH);
+
+        if ( noflush )
+            val &= ~X86_CR3_NOFLUSH;
+        rc = hvm_set_cr3(val, noflush, true);
         break;
+    }
 
     case 4:
         rc = hvm_set_cr4(val, true);
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2076,8 +2076,14 @@ int hvm_mov_to_cr(unsigned int cr, unsig
         break;
 
     case 3:
-        rc = hvm_set_cr3(val, true);
+    {
+        bool noflush = hvm_pcid_enabled(curr) && (val & X86_CR3_NOFLUSH);
+
+        if ( noflush )
+            val &= ~X86_CR3_NOFLUSH;
+        rc = hvm_set_cr3(val, noflush, true);
         break;
+    }
 
     case 4:
         rc = hvm_set_cr4(val, true);
@@ -2294,12 +2300,11 @@ int hvm_set_cr0(unsigned long value, boo
     return X86EMUL_OKAY;
 }
 
-int hvm_set_cr3(unsigned long value, bool may_defer)
+int hvm_set_cr3(unsigned long value, bool noflush, bool may_defer)
 {
     struct vcpu *v = current;
     struct page_info *page;
     unsigned long old = v->arch.hvm.guest_cr[3];
-    bool noflush = false;
 
     if ( may_defer && unlikely(v->domain->arch.monitor.write_ctrlreg_enabled &
                                monitor_ctrlreg_bitmask(VM_EVENT_X86_CR3)) )
@@ -2311,17 +2316,12 @@ int hvm_set_cr3(unsigned long value, boo
             /* The actual write will occur in hvm_do_resume(), if permitted. */
             v->arch.vm_event->write_data.do_write.cr3 = 1;
             v->arch.vm_event->write_data.cr3 = value;
+            v->arch.vm_event->write_data.cr3_noflush = noflush;
 
             return X86EMUL_OKAY;
         }
     }
 
-    if ( hvm_pcid_enabled(v) ) /* Clear the noflush bit. */
-    {
-        noflush = value & X86_CR3_NOFLUSH;
-        value &= ~X86_CR3_NOFLUSH;
-    }
-
     if ( hvm_paging_enabled(v) && !paging_mode_hap(v->domain) &&
          ((value ^ v->arch.hvm.guest_cr[3]) >> PAGE_SHIFT) )
     {
@@ -3016,7 +3016,7 @@ void hvm_task_switch(
     if ( task_switch_load_seg(x86_seg_ldtr, tss.ldt, new_cpl, 0) )
         goto out;
 
-    rc = hvm_set_cr3(tss.cr3, true);
+    rc = hvm_set_cr3(tss.cr3, false, true);
     if ( rc == X86EMUL_EXCEPTION )
         hvm_inject_hw_exception(TRAP_gp_fault, 0);
     if ( rc != X86EMUL_OKAY )
--- a/xen/arch/x86/hvm/monitor.c
+++ b/xen/arch/x86/hvm/monitor.c
@@ -38,9 +38,6 @@ bool hvm_monitor_cr(unsigned int index,
     struct arch_domain *ad = &curr->domain->arch;
     unsigned int ctrlreg_bitmask = monitor_ctrlreg_bitmask(index);
 
-    if ( index == VM_EVENT_X86_CR3 && hvm_pcid_enabled(curr) )
-        value &= ~X86_CR3_NOFLUSH; /* Clear the noflush bit. */
-
     if ( (ad->monitor.write_ctrlreg_enabled & ctrlreg_bitmask) &&
          (!(ad->monitor.write_ctrlreg_onchangeonly & ctrlreg_bitmask) ||
           value != old) &&
--- a/xen/arch/x86/hvm/svm/nestedsvm.c
+++ b/xen/arch/x86/hvm/svm/nestedsvm.c
@@ -324,7 +324,7 @@ static int nsvm_vcpu_hostrestore(struct
         v->arch.guest_table = pagetable_null();
         /* hvm_set_cr3() below sets v->arch.hvm.guest_cr[3] for us. */
     }
-    rc = hvm_set_cr3(n1vmcb->_cr3, true);
+    rc = hvm_set_cr3(n1vmcb->_cr3, false, true);
     if ( rc == X86EMUL_EXCEPTION )
         hvm_inject_hw_exception(TRAP_gp_fault, 0);
     if (rc != X86EMUL_OKAY)
@@ -584,7 +584,7 @@ static int nsvm_vmcb_prepare4vmrun(struc
         nestedsvm_vmcb_set_nestedp2m(v, ns_vmcb, n2vmcb);
 
         /* hvm_set_cr3() below sets v->arch.hvm.guest_cr[3] for us. */
-        rc = hvm_set_cr3(ns_vmcb->_cr3, true);
+        rc = hvm_set_cr3(ns_vmcb->_cr3, false, true);
         if ( rc == X86EMUL_EXCEPTION )
             hvm_inject_hw_exception(TRAP_gp_fault, 0);
         if (rc != X86EMUL_OKAY)
@@ -598,7 +598,7 @@ static int nsvm_vmcb_prepare4vmrun(struc
          * we assume it intercepts page faults.
          */
         /* hvm_set_cr3() below sets v->arch.hvm.guest_cr[3] for us. */
-        rc = hvm_set_cr3(ns_vmcb->_cr3, true);
+        rc = hvm_set_cr3(ns_vmcb->_cr3, false, true);
         if ( rc == X86EMUL_EXCEPTION )
             hvm_inject_hw_exception(TRAP_gp_fault, 0);
         if (rc != X86EMUL_OKAY)
--- a/xen/arch/x86/hvm/vm_event.c
+++ b/xen/arch/x86/hvm/vm_event.c
@@ -110,7 +110,7 @@ void hvm_vm_event_do_resume(struct vcpu
 
     if ( unlikely(w->do_write.cr3) )
     {
-        if ( hvm_set_cr3(w->cr3, false) == X86EMUL_EXCEPTION )
+        if ( hvm_set_cr3(w->cr3, w->cr3_noflush, false) == X86EMUL_EXCEPTION )
             hvm_inject_hw_exception(TRAP_gp_fault, 0);
 
         w->do_write.cr3 = 0;
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -1032,7 +1032,7 @@ static void load_shadow_guest_state(stru
     if ( rc == X86EMUL_EXCEPTION )
         hvm_inject_hw_exception(TRAP_gp_fault, 0);
 
-    rc = hvm_set_cr3(get_vvmcs(v, GUEST_CR3), true);
+    rc = hvm_set_cr3(get_vvmcs(v, GUEST_CR3), false, true);
     if ( rc == X86EMUL_EXCEPTION )
         hvm_inject_hw_exception(TRAP_gp_fault, 0);
 
@@ -1246,7 +1246,7 @@ static void load_vvmcs_host_state(struct
     if ( rc == X86EMUL_EXCEPTION )
         hvm_inject_hw_exception(TRAP_gp_fault, 0);
 
-    rc = hvm_set_cr3(get_vvmcs(v, HOST_CR3), true);
+    rc = hvm_set_cr3(get_vvmcs(v, HOST_CR3), false, true);
     if ( rc == X86EMUL_EXCEPTION )
         hvm_inject_hw_exception(TRAP_gp_fault, 0);
 
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -274,6 +274,8 @@ struct monitor_write_data {
         unsigned int cr4 : 1;
     } do_write;
 
+    bool cr3_noflush;
+
     uint32_t msr;
     uint64_t value;
     uint64_t cr0;
--- a/xen/include/asm-x86/hvm/support.h
+++ b/xen/include/asm-x86/hvm/support.h
@@ -136,7 +136,7 @@ void hvm_shadow_handle_cd(struct vcpu *v
  */
 int hvm_set_efer(uint64_t value);
 int hvm_set_cr0(unsigned long value, bool may_defer);
-int hvm_set_cr3(unsigned long value, bool may_defer);
+int hvm_set_cr3(unsigned long value, bool noflush, bool may_defer);
 int hvm_set_cr4(unsigned long value, bool may_defer);
 int hvm_descriptor_access_intercept(uint64_t exit_info,
                                     uint64_t vmx_exit_qualification,

From patchwork Wed Sep 25 15:25:53 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 11160959
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
References: <3ce4ab2c-8cb6-1482-6ce9-3d5b019e10c1@suse.com>
Message-ID:
Date: Wed, 25 Sep 2019 17:25:53 +0200
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0
MIME-Version:
 1.0
In-Reply-To: <3ce4ab2c-8cb6-1482-6ce9-3d5b019e10c1@suse.com>
Content-Language: en-US
Subject: [Xen-devel] [PATCH v3 4/5] x86/HVM: refuse CR3 loads with reserved (upper) bits set
Cc: George Dunlap , Andrew Cooper , Wei Liu , =?utf-8?q?Roger_Pau_Monn=C3=A9?=

While bits 11 and below are, if not used for other purposes, reserved
but ignored, bits beyond physical address width are supposed to raise
exceptions (at least in the non-nested case; I'm not convinced the
current nested SVM/VMX behavior of raising #GP(0) here is correct, but
that's not the subject of this change).

Introduce currd as a local variable, and replace other v->domain
instances at the same time.

Signed-off-by: Jan Beulich
Reviewed-by: Roger Pau Monné
Reviewed-by: Andrew Cooper
---
v3: Correct return value in hvm_load_cpu_ctxt(). Re-base.
v2: Simplify the expressions used for the reserved bit checks.
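As an illustration of the reserved bit test used by this change: the
check amounts to shifting the value right by the guest's physical
address width and seeing whether anything is left. The helper name
below is hypothetical (the patch open-codes the expression), and the
sketch assumes 0 < maxphysaddr < 64, since shifting a 64-bit value by
64 or more is undefined in C.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if any bit at or above the physical address
 * width is set. Bits 11 and below are deliberately not looked at.
 */
static bool cr3_has_reserved_upper_bits(uint64_t cr3, unsigned int maxphysaddr)
{
    return (cr3 >> maxphysaddr) != 0;
}
```

With a 36-bit address width, a value with only low bits set passes,
while one with bit 36 set is refused. Bit 63 would be refused as well,
which is consistent with the MOV-to-CR3 callers earlier in the series
clearing X86_CR3_NOFLUSH before calling hvm_set_cr3().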
---
 xen/arch/x86/hvm/hvm.c | 24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1016,6 +1016,13 @@ static int hvm_load_cpu_ctxt(struct doma
         return -EINVAL;
     }
 
+    if ( ctxt.cr3 >> d->arch.cpuid->extd.maxphysaddr )
+    {
+        printk(XENLOG_G_ERR "HVM%d restore: bad CR3 %#" PRIx64 "\n",
+               d->domain_id, ctxt.cr3);
+        return -EINVAL;
+    }
+
     if ( (ctxt.flags & ~XEN_X86_FPU_INITIALISED) != 0 )
     {
         gprintk(XENLOG_ERR, "bad flags value in CPU context: %#x\n",
@@ -2303,10 +2310,18 @@ int hvm_set_cr0(unsigned long value, boo
 int hvm_set_cr3(unsigned long value, bool noflush, bool may_defer)
 {
     struct vcpu *v = current;
+    struct domain *currd = v->domain;
     struct page_info *page;
     unsigned long old = v->arch.hvm.guest_cr[3];
 
-    if ( may_defer && unlikely(v->domain->arch.monitor.write_ctrlreg_enabled &
+    if ( value >> currd->arch.cpuid->extd.maxphysaddr )
+    {
+        HVM_DBG_LOG(DBG_LEVEL_1,
+                    "Attempt to set reserved CR3 bit(s): %lx", value);
+        return X86EMUL_EXCEPTION;
+    }
+
+    if ( may_defer && unlikely(currd->arch.monitor.write_ctrlreg_enabled &
                                monitor_ctrlreg_bitmask(VM_EVENT_X86_CR3)) )
     {
         ASSERT(v->arch.vm_event);
@@ -2322,13 +2337,12 @@ int hvm_set_cr3(unsigned long value, boo
         }
     }
 
-    if ( hvm_paging_enabled(v) && !paging_mode_hap(v->domain) &&
+    if ( hvm_paging_enabled(v) && !paging_mode_hap(currd) &&
          ((value ^ v->arch.hvm.guest_cr[3]) >> PAGE_SHIFT) )
     {
         /* Shadow-mode CR3 change. Check PDBR and update refcounts. */
         HVM_DBG_LOG(DBG_LEVEL_VMMU, "CR3 value = %lx", value);
-        page = get_page_from_gfn(v->domain, value >> PAGE_SHIFT,
-                                 NULL, P2M_ALLOC);
+        page = get_page_from_gfn(currd, value >> PAGE_SHIFT, NULL, P2M_ALLOC);
         if ( !page )
             goto bad_cr3;
 
@@ -2344,7 +2358,7 @@ int hvm_set_cr3(unsigned long value, boo
 
  bad_cr3:
     gdprintk(XENLOG_ERR, "Invalid CR3\n");
-    domain_crash(v->domain);
+    domain_crash(currd);
     return X86EMUL_UNHANDLEABLE;
 }
 

From patchwork Wed Sep 25 15:26:18 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 11160961
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
References: <3ce4ab2c-8cb6-1482-6ce9-3d5b019e10c1@suse.com>
Message-ID: <3ce2aac4-de6c-7197-751d-34858305dfd9@suse.com>
Date: Wed, 25 Sep 2019 17:26:18 +0200
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0
MIME-Version: 1.0
In-Reply-To: <3ce4ab2c-8cb6-1482-6ce9-3d5b019e10c1@suse.com>
Content-Language: en-US
Subject: [Xen-devel] [PATCH v3 5/5] x86/HVM: cosmetics to hvm_set_cr3()
Cc: George Dunlap , Andrew Cooper , Wei Liu , =?utf-8?q?Roger_Pau_Monn=C3=A9?=

Eliminate the not really useful local variable "old". Reduce the scope
of "page". Rename the latched "current".

Signed-off-by: Jan Beulich
Reviewed-by: Roger Pau Monné
Acked-by: Andrew Cooper
---
v2: Re-base over change earlier in the series.
---
 xen/arch/x86/hvm/hvm.c | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2309,10 +2309,8 @@ int hvm_set_cr0(unsigned long value, boo
 
 int hvm_set_cr3(unsigned long value, bool noflush, bool may_defer)
 {
-    struct vcpu *v = current;
-    struct domain *currd = v->domain;
-    struct page_info *page;
-    unsigned long old = v->arch.hvm.guest_cr[3];
+    struct vcpu *curr = current;
+    struct domain *currd = curr->domain;
 
     if ( value >> currd->arch.cpuid->extd.maxphysaddr )
     {
@@ -2324,36 +2322,38 @@ int hvm_set_cr3(unsigned long value, boo
     if ( may_defer && unlikely(currd->arch.monitor.write_ctrlreg_enabled &
                                monitor_ctrlreg_bitmask(VM_EVENT_X86_CR3)) )
     {
-        ASSERT(v->arch.vm_event);
+        ASSERT(curr->arch.vm_event);
 
-        if ( hvm_monitor_crX(CR3, value, old) )
+        if ( hvm_monitor_crX(CR3, value, curr->arch.hvm.guest_cr[3]) )
         {
             /* The actual write will occur in hvm_do_resume(), if permitted. */
-            v->arch.vm_event->write_data.do_write.cr3 = 1;
-            v->arch.vm_event->write_data.cr3 = value;
-            v->arch.vm_event->write_data.cr3_noflush = noflush;
+            curr->arch.vm_event->write_data.do_write.cr3 = 1;
+            curr->arch.vm_event->write_data.cr3 = value;
+            curr->arch.vm_event->write_data.cr3_noflush = noflush;
 
             return X86EMUL_OKAY;
         }
     }
 
-    if ( hvm_paging_enabled(v) && !paging_mode_hap(currd) &&
-         ((value ^ v->arch.hvm.guest_cr[3]) >> PAGE_SHIFT) )
+    if ( hvm_paging_enabled(curr) && !paging_mode_hap(currd) &&
+         ((value ^ curr->arch.hvm.guest_cr[3]) >> PAGE_SHIFT) )
     {
         /* Shadow-mode CR3 change. Check PDBR and update refcounts. */
+        struct page_info *page;
+
         HVM_DBG_LOG(DBG_LEVEL_VMMU, "CR3 value = %lx", value);
         page = get_page_from_gfn(currd, value >> PAGE_SHIFT, NULL, P2M_ALLOC);
         if ( !page )
             goto bad_cr3;
 
-        put_page(pagetable_get_page(v->arch.guest_table));
-        v->arch.guest_table = pagetable_from_page(page);
+        put_page(pagetable_get_page(curr->arch.guest_table));
+        curr->arch.guest_table = pagetable_from_page(page);
 
         HVM_DBG_LOG(DBG_LEVEL_VMMU, "Update CR3 value = %lx", value);
     }
 
-    v->arch.hvm.guest_cr[3] = value;
-    paging_update_cr3(v, noflush);
+    curr->arch.hvm.guest_cr[3] = value;
+    paging_update_cr3(curr, noflush);
     return X86EMUL_OKAY;
 
  bad_cr3: