Message ID | 1453921984-29197-1-git-send-email-andrew.cooper3@citrix.com (mailing list archive)
---|---
State | New, archived
On 01/27/2016 02:13 PM, Andrew Cooper wrote:
> c/s 0f1cb96e "x86 hvm: Allow cross-vendor migration" caused HVM domains to
> unconditionally intercept #UD exceptions.  While cross-vendor migration is
> cool as a demo, it is extremely niche.
>
> Intercepting #UD allows userspace code in a multi-vcpu guest to execute
> arbitrary instructions in the x86 emulator by having one thread execute a
> ud2a instruction, and having a second thread rewrite the instruction before
> the emulator performs an instruction fetch.
>
> XSAs 105, 106 and 110 are all examples where guest userspace can use bugs in
> the x86 emulator to compromise security of the domain, either by privilege
> escalation or causing a crash.
>
> c/s 2d67a7a4 "x86: synchronize PCI config space access decoding"
> introduced (amongst other things) a per-domain vendor, based on the guests
> cpuid policy.
>
> Use the per-guest vendor to enable #UD interception only when a domain is
> configured for a vendor different to the current hardware.  (#UD
> interception is also enabled if hvm_fep is specified on the Xen command
> line.  This is a debug-only option whose entire purpose is for testing the
> x86 emulator.)
>
> As a result, the overwhelming majority of usecases now have #UD
> interception disabled, removing an attack surface for malicious guest
> userspace.
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 674feea..7a15d49 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -93,12 +93,10 @@ unsigned long __section(".bss.page_aligned")
>  static bool_t __initdata opt_hap_enabled = 1;
>  boolean_param("hap", opt_hap_enabled);
>
> -#ifndef NDEBUG
> +#ifndef opt_hvm_fep
>  /* Permit use of the Forced Emulation Prefix in HVM guests */
> -static bool_t opt_hvm_fep;
> +bool_t opt_hvm_fep;
>  boolean_param("hvm_fep", opt_hvm_fep);

Since you remove the debug option you should probably also update the
documentation, which says: "Recognized in debug builds of the hypervisor only."
On 27/01/2016 19:52, Konrad Rzeszutek Wilk wrote:
>> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
>> index 674feea..7a15d49 100644
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -93,12 +93,10 @@ unsigned long __section(".bss.page_aligned")
>>  static bool_t __initdata opt_hap_enabled = 1;
>>  boolean_param("hap", opt_hap_enabled);
>>
>> -#ifndef NDEBUG
>> +#ifndef opt_hvm_fep
>>  /* Permit use of the Forced Emulation Prefix in HVM guests */
>> -static bool_t opt_hvm_fep;
>> +bool_t opt_hvm_fep;
>>  boolean_param("hvm_fep", opt_hvm_fep);
> Since you remove the debug option you should probably also update the
> documentation which says: ">Recognized in debug builds of the hypervisor only."

This doesn't change the "debug-only"-ness of the option.

Observe in the first hunk to hvm.h that opt_hvm_fep is defined to 0 in a
non-debug build, which causes this hunk to be omitted.

This actually matches the original introduction of opt_hvm_fep, before
it was reduced in scope to only hvm.c.  I now need it available again in
other translation units.

~Andrew
On Wed, Jan 27, 2016 at 07:57:00PM +0000, Andrew Cooper wrote:
> On 27/01/2016 19:52, Konrad Rzeszutek Wilk wrote:
> >> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> >> index 674feea..7a15d49 100644
> >> --- a/xen/arch/x86/hvm/hvm.c
> >> +++ b/xen/arch/x86/hvm/hvm.c
> >> @@ -93,12 +93,10 @@ unsigned long __section(".bss.page_aligned")
> >>  static bool_t __initdata opt_hap_enabled = 1;
> >>  boolean_param("hap", opt_hap_enabled);
> >>
> >> -#ifndef NDEBUG
> >> +#ifndef opt_hvm_fep
> >>  /* Permit use of the Forced Emulation Prefix in HVM guests */
> >> -static bool_t opt_hvm_fep;
> >> +bool_t opt_hvm_fep;
> >>  boolean_param("hvm_fep", opt_hvm_fep);
> > Since you remove the debug option you should probably also update the
> > documentation which says: ">Recognized in debug builds of the hypervisor only."
>
> This doesn't change the "debug-only"-ness of the option.
>
> Observe in the first hunk to hvm.h that opt_hvm_fep is defined to 0 in a
> non-debug build, which causes this hunk to be omitted.

I missed that. Sorry for the noise.

> This actually matches the original introduction of opt_hvm_fep, before
> it was reduced in scope to only hvm.c.  I now need it available again in
> other translation units.
>
> ~Andrew
>>> On 27.01.16 at 20:13, <andrew.cooper3@citrix.com> wrote:
> --- a/xen/arch/x86/hvm/svm/vmcb.c
> +++ b/xen/arch/x86/hvm/svm/vmcb.c
> @@ -192,6 +192,7 @@ static int construct_vmcb(struct vcpu *v)
>
>      vmcb->_exception_intercepts =
>          HVM_TRAP_MASK
> +        | (opt_hvm_fep ? (1U << TRAP_invalid_op) : 0)
>          | (1U << TRAP_no_device);

This assumes a certain sequence of hypercalls by the tool stack (i.e.
set-cpuid only after all vCPU-s got created, or else the intercept won't
get enabled), which I think we should avoid. Instead I think you'd better
call the new hook from hvm_vcpu_initialise().

If the above is not an option for some reason, and considering you do the
same change in vmcs.c, wouldn't it make sense to extend HVM_TRAP_MASK
accordingly?

Jan
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 1d71216..316e13a 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -65,8 +65,18 @@ static void update_domain_cpuid_info(struct domain *d,
                 .ecx = ctl->ecx
             }
         };
+        int old_vendor = d->arch.x86_vendor;
 
         d->arch.x86_vendor = get_cpu_vendor(vendor_id.str, gcv_guest);
+
+        if ( is_hvm_domain(d) && (d->arch.x86_vendor != old_vendor) )
+        {
+            struct vcpu *v;
+
+            for_each_vcpu( d, v )
+                hvm_update_guest_vendor(v);
+        }
+
         break;
     }
 
@@ -707,6 +717,12 @@ long arch_do_domctl(
         xen_domctl_cpuid_t *ctl = &domctl->u.cpuid;
         cpuid_input_t *cpuid, *unused = NULL;
 
+        if ( d == currd ) /* no domain_pause() */
+        {
+            ret = -EINVAL;
+            break;
+        }
+
         for ( i = 0; i < MAX_CPUID_INPUT; i++ )
         {
             cpuid = &d->arch.cpuids[i];
@@ -724,6 +740,8 @@ long arch_do_domctl(
                 break;
         }
 
+        domain_pause(d);
+
         if ( i < MAX_CPUID_INPUT )
             *cpuid = *ctl;
         else if ( unused )
@@ -734,6 +752,7 @@ long arch_do_domctl(
         if ( !ret )
             update_domain_cpuid_info(d, ctl);
 
+        domain_unpause(d);
         break;
     }
 
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 674feea..7a15d49 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -93,12 +93,10 @@ unsigned long __section(".bss.page_aligned")
 static bool_t __initdata opt_hap_enabled = 1;
 boolean_param("hap", opt_hap_enabled);
 
-#ifndef NDEBUG
+#ifndef opt_hvm_fep
 /* Permit use of the Forced Emulation Prefix in HVM guests */
-static bool_t opt_hvm_fep;
+bool_t opt_hvm_fep;
 boolean_param("hvm_fep", opt_hvm_fep);
-#else
-#define opt_hvm_fep 0
 #endif
 
 /* Xen command-line option to enable altp2m */
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 953e0b5..e62dfa1 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -597,6 +597,21 @@ static void svm_update_guest_efer(struct vcpu *v)
     vmcb_set_efer(vmcb, new_efer);
 }
 
+static void svm_update_guest_vendor(struct vcpu *v)
+{
+    struct arch_svm_struct *arch_svm = &v->arch.hvm_svm;
+    struct vmcb_struct *vmcb = arch_svm->vmcb;
+    u32 bitmap = vmcb_get_exception_intercepts(vmcb);
+
+    if ( opt_hvm_fep ||
+         (v->domain->arch.x86_vendor != boot_cpu_data.x86_vendor) )
+        bitmap |= (1U << TRAP_invalid_op);
+    else
+        bitmap &= ~(1U << TRAP_invalid_op);
+
+    vmcb_set_exception_intercepts(vmcb, bitmap);
+}
+
 static void svm_sync_vmcb(struct vcpu *v)
 {
     struct arch_svm_struct *arch_svm = &v->arch.hvm_svm;
@@ -2245,6 +2260,7 @@ static struct hvm_function_table __initdata svm_function_table = {
     .get_shadow_gs_base   = svm_get_shadow_gs_base,
     .update_guest_cr      = svm_update_guest_cr,
     .update_guest_efer    = svm_update_guest_efer,
+    .update_guest_vendor  = svm_update_guest_vendor,
     .set_guest_pat        = svm_set_guest_pat,
     .get_guest_pat        = svm_get_guest_pat,
     .set_tsc_offset       = svm_set_tsc_offset,
diff --git a/xen/arch/x86/hvm/svm/vmcb.c b/xen/arch/x86/hvm/svm/vmcb.c
index 9ea014f..be2dc32 100644
--- a/xen/arch/x86/hvm/svm/vmcb.c
+++ b/xen/arch/x86/hvm/svm/vmcb.c
@@ -192,6 +192,7 @@ static int construct_vmcb(struct vcpu *v)
 
     vmcb->_exception_intercepts =
         HVM_TRAP_MASK
+        | (opt_hvm_fep ? (1U << TRAP_invalid_op) : 0)
         | (1U << TRAP_no_device);
 
     if ( paging_mode_hap(v->domain) )
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 5bc3c74..a12813a 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -1237,6 +1237,7 @@ static int construct_vmcs(struct vcpu *v)
     v->arch.hvm_vmx.exception_bitmap = HVM_TRAP_MASK
               | (paging_mode_hap(d) ? 0 : (1U << TRAP_page_fault))
+              | (opt_hvm_fep ? (1U << TRAP_invalid_op) : 0)
               | (1U << TRAP_no_device);
     vmx_update_exception_bitmap(v);
 
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 4f9951f..195def6 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -73,6 +73,7 @@ static void vmx_free_vlapic_mapping(struct domain *d);
 static void vmx_install_vlapic_mapping(struct vcpu *v);
 static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr);
 static void vmx_update_guest_efer(struct vcpu *v);
+static void vmx_update_guest_vendor(struct vcpu *v);
 static void vmx_cpuid_intercept(
     unsigned int *eax, unsigned int *ebx,
     unsigned int *ecx, unsigned int *edx);
@@ -398,6 +399,19 @@ void vmx_update_exception_bitmap(struct vcpu *v)
     __vmwrite(EXCEPTION_BITMAP, bitmap);
 }
 
+static void vmx_update_guest_vendor(struct vcpu *v)
+{
+    if ( opt_hvm_fep ||
+         (v->domain->arch.x86_vendor != boot_cpu_data.x86_vendor) )
+        v->arch.hvm_vmx.exception_bitmap |= (1U << TRAP_invalid_op);
+    else
+        v->arch.hvm_vmx.exception_bitmap &= ~(1U << TRAP_invalid_op);
+
+    vmx_vmcs_enter(v);
+    vmx_update_exception_bitmap(v);
+    vmx_vmcs_exit(v);
+}
+
 static int vmx_guest_x86_mode(struct vcpu *v)
 {
     unsigned long cs_ar_bytes;
@@ -1963,6 +1977,7 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .update_host_cr3      = vmx_update_host_cr3,
     .update_guest_cr      = vmx_update_guest_cr,
     .update_guest_efer    = vmx_update_guest_efer,
+    .update_guest_vendor  = vmx_update_guest_vendor,
     .set_guest_pat        = vmx_set_guest_pat,
     .get_guest_pat        = vmx_get_guest_pat,
     .set_tsc_offset       = vmx_set_tsc_offset,
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index a87224b..0b15616 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -28,6 +28,13 @@
 #include <public/hvm/ioreq.h>
 #include <xen/mm.h>
 
+#ifndef NDEBUG
+/* Permit use of the Forced Emulation Prefix in HVM guests */
+extern bool_t opt_hvm_fep;
+#else
+#define opt_hvm_fep 0
+#endif
+
 /* Interrupt acknowledgement sources. */
 enum hvm_intsrc {
     hvm_intsrc_none,
@@ -136,6 +143,8 @@ struct hvm_function_table {
     void (*update_guest_cr)(struct vcpu *v, unsigned int cr);
     void (*update_guest_efer)(struct vcpu *v);
 
+    void (*update_guest_vendor)(struct vcpu *v);
+
     int  (*get_guest_pat)(struct vcpu *v, u64 *);
     int  (*set_guest_pat)(struct vcpu *v, u64);
 
@@ -316,6 +325,11 @@ static inline void hvm_update_guest_efer(struct vcpu *v)
     hvm_funcs.update_guest_efer(v);
 }
 
+static inline void hvm_update_guest_vendor(struct vcpu *v)
+{
+    hvm_funcs.update_guest_vendor(v);
+}
+
 /*
  * Called to ensure than all guest-specific mappings in a tagged TLB are
  * flushed; does *not* flush Xen's TLB entries, and on processors without a
@@ -387,7 +401,6 @@ static inline int hvm_event_pending(struct vcpu *v)
 
 /* These exceptions must always be intercepted. */
 #define HVM_TRAP_MASK ((1U << TRAP_debug)           | \
-                       (1U << TRAP_invalid_op)      | \
                        (1U << TRAP_alignment_check) | \
                        (1U << TRAP_machine_check))
c/s 0f1cb96e "x86 hvm: Allow cross-vendor migration" caused HVM domains to
unconditionally intercept #UD exceptions.  While cross-vendor migration is
cool as a demo, it is extremely niche.

Intercepting #UD allows userspace code in a multi-vcpu guest to execute
arbitrary instructions in the x86 emulator by having one thread execute a
ud2a instruction, and having a second thread rewrite the instruction before
the emulator performs an instruction fetch.

XSAs 105, 106 and 110 are all examples where guest userspace can use bugs in
the x86 emulator to compromise security of the domain, either by privilege
escalation or causing a crash.

c/s 2d67a7a4 "x86: synchronize PCI config space access decoding" introduced
(amongst other things) a per-domain vendor, based on the guest's cpuid
policy.

Use the per-guest vendor to enable #UD interception only when a domain is
configured for a vendor different to the current hardware.  (#UD
interception is also enabled if hvm_fep is specified on the Xen command
line.  This is a debug-only option whose entire purpose is for testing the
x86 emulator.)

As a result, the overwhelming majority of use cases now have #UD
interception disabled, removing an attack surface for malicious guest
userspace.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
CC: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>

v2:
 * Pause the domain while updating cpuid information.  In practice, the
   set_cpuid hypercall is only made during domain construction.
 * Use vmcb_{get,set}_exception_intercepts() to provide appropriate
   manipulation of the clean bits.
---
 xen/arch/x86/domctl.c         | 19 +++++++++++++++++++
 xen/arch/x86/hvm/hvm.c        |  6 ++----
 xen/arch/x86/hvm/svm/svm.c    | 16 ++++++++++++++++
 xen/arch/x86/hvm/svm/vmcb.c   |  1 +
 xen/arch/x86/hvm/vmx/vmcs.c   |  1 +
 xen/arch/x86/hvm/vmx/vmx.c    | 15 +++++++++++++++
 xen/include/asm-x86/hvm/hvm.h | 15 ++++++++++++++-
 7 files changed, 68 insertions(+), 5 deletions(-)