From patchwork Fri May 5 14:48:31 2017
X-Patchwork-Submitter: Wei Liu
X-Patchwork-Id: 9713733
From: Wei Liu <wei.liu2@citrix.com>
To: Xen-devel <xen-devel@lists.xenproject.org>
Date: Fri, 5 May 2017 15:48:31 +0100
Message-ID: <20170505144836.8612-14-wei.liu2@citrix.com>
X-Mailer: git-send-email 2.11.0
In-Reply-To: <20170505144836.8612-1-wei.liu2@citrix.com>
References: <20170505144836.8612-1-wei.liu2@citrix.com>
Cc: Andrew Cooper, Wei Liu, Jan Beulich
Subject: [Xen-devel] [PATCH v2 13/18] x86/traps: move PV specific code in x86_64/traps.c

Move them to pv/traps.c. This in turn requires exporting
pv_percpu_traps_init and hypercall_page_initialise_ring3_kernel.

No functional change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/pv/traps.c         | 363 ++++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/x86_64/traps.c     | 363 +---------------------------------------
 xen/include/asm-x86/pv/domain.h |   5 +
 xen/include/asm-x86/pv/traps.h  |   4 +
 4 files changed, 374 insertions(+), 361 deletions(-)

diff --git a/xen/arch/x86/pv/traps.c b/xen/arch/x86/pv/traps.c
index f2627b4215..178ac2773c 100644
--- a/xen/arch/x86/pv/traps.c
+++ b/xen/arch/x86/pv/traps.c
@@ -32,6 +32,8 @@
 #include
 #include
 
+#include
+
 void do_entry_int82(struct cpu_user_regs *regs)
 {
     if ( unlikely(untrusted_msi) )
@@ -323,6 +325,367 @@ int send_guest_trap(struct domain *d, uint16_t vcpuid, unsigned int trap_nr)
     return -EIO;
 }
 
+void toggle_guest_mode(struct vcpu *v)
+{
+    if ( is_pv_32bit_vcpu(v) )
+        return;
+    if ( cpu_has_fsgsbase )
+    {
+        if ( v->arch.flags & TF_kernel_mode )
+            v->arch.pv_vcpu.gs_base_kernel = __rdgsbase();
+        else
+            v->arch.pv_vcpu.gs_base_user = __rdgsbase();
+    }
+    v->arch.flags ^= TF_kernel_mode;
+    asm volatile ( "swapgs" );
+    update_cr3(v);
+    /* Don't flush user global mappings from the TLB. Don't tick TLB clock. */
+    asm volatile ( "mov %0, %%cr3" : : "r" (v->arch.cr3) : "memory" );
+
+    if ( !(v->arch.flags & TF_kernel_mode) )
+        return;
+
+    if ( v->arch.pv_vcpu.need_update_runstate_area &&
+         update_runstate_area(v) )
+        v->arch.pv_vcpu.need_update_runstate_area = 0;
+
+    if ( v->arch.pv_vcpu.pending_system_time.version &&
+         update_secondary_system_time(v,
+                                      &v->arch.pv_vcpu.pending_system_time) )
+        v->arch.pv_vcpu.pending_system_time.version = 0;
+}
+
+unsigned long do_iret(void)
+{
+    struct cpu_user_regs *regs = guest_cpu_user_regs();
+    struct iret_context iret_saved;
+    struct vcpu *v = current;
+
+    if ( unlikely(copy_from_user(&iret_saved, (void *)regs->rsp,
+                                 sizeof(iret_saved))) )
+    {
+        gprintk(XENLOG_ERR,
+                "Fault while reading IRET context from guest stack\n");
+        goto exit_and_crash;
+    }
+
+    /* Returning to user mode? */
+    if ( (iret_saved.cs & 3) == 3 )
+    {
+        if ( unlikely(pagetable_is_null(v->arch.guest_table_user)) )
+        {
+            gprintk(XENLOG_ERR,
+                    "Guest switching to user mode with no user page tables\n");
+            goto exit_and_crash;
+        }
+        toggle_guest_mode(v);
+    }
+
+    if ( VM_ASSIST(v->domain, architectural_iopl) )
+        v->arch.pv_vcpu.iopl = iret_saved.rflags & X86_EFLAGS_IOPL;
+
+    regs->rip    = iret_saved.rip;
+    regs->cs     = iret_saved.cs | 3; /* force guest privilege */
+    regs->rflags = ((iret_saved.rflags & ~(X86_EFLAGS_IOPL|X86_EFLAGS_VM))
+                    | X86_EFLAGS_IF);
+    regs->rsp    = iret_saved.rsp;
+    regs->ss     = iret_saved.ss | 3; /* force guest privilege */
+
+    if ( !(iret_saved.flags & VGCF_in_syscall) )
+    {
+        regs->entry_vector &= ~TRAP_syscall;
+        regs->r11 = iret_saved.r11;
+        regs->rcx = iret_saved.rcx;
+    }
+
+    /* Restore upcall mask from supplied EFLAGS.IF. */
+    vcpu_info(v, evtchn_upcall_mask) = !(iret_saved.rflags & X86_EFLAGS_IF);
+
+    async_exception_cleanup(v);
+
+    /* Saved %rax gets written back to regs->rax in entry.S. */
+    return iret_saved.rax;
+
+ exit_and_crash:
+    domain_crash(v->domain);
+    return 0;
+}
+
+static unsigned int write_stub_trampoline(
+    unsigned char *stub, unsigned long stub_va,
+    unsigned long stack_bottom, unsigned long target_va)
+{
+    /* movabsq %rax, stack_bottom - 8 */
+    stub[0] = 0x48;
+    stub[1] = 0xa3;
+    *(uint64_t *)&stub[2] = stack_bottom - 8;
+
+    /* movq %rsp, %rax */
+    stub[10] = 0x48;
+    stub[11] = 0x89;
+    stub[12] = 0xe0;
+
+    /* movabsq $stack_bottom - 8, %rsp */
+    stub[13] = 0x48;
+    stub[14] = 0xbc;
+    *(uint64_t *)&stub[15] = stack_bottom - 8;
+
+    /* pushq %rax */
+    stub[23] = 0x50;
+
+    /* jmp target_va */
+    stub[24] = 0xe9;
+    *(int32_t *)&stub[25] = target_va - (stub_va + 29);
+
+    /* Round up to a multiple of 16 bytes. */
+    return 32;
+}
+
+DEFINE_PER_CPU(struct stubs, stubs);
+void lstar_enter(void);
+void cstar_enter(void);
+
+void pv_percpu_traps_init(void)
+{
+    unsigned long stack_bottom = get_stack_bottom();
+    unsigned long stub_va = this_cpu(stubs.addr);
+    unsigned char *stub_page;
+    unsigned int offset;
+
+    stub_page = map_domain_page(_mfn(this_cpu(stubs.mfn)));
+
+    /*
+     * Trampoline for SYSCALL entry from 64-bit mode. The VT-x HVM vcpu
+     * context switch logic relies on the SYSCALL trampoline being at the
+     * start of the stubs.
+     */
+    wrmsrl(MSR_LSTAR, stub_va);
+    offset = write_stub_trampoline(stub_page + (stub_va & ~PAGE_MASK),
+                                   stub_va, stack_bottom,
+                                   (unsigned long)lstar_enter);
+    stub_va += offset;
+
+    if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL ||
+         boot_cpu_data.x86_vendor == X86_VENDOR_CENTAUR )
+    {
+        /* SYSENTER entry. */
+        wrmsrl(MSR_IA32_SYSENTER_ESP, stack_bottom);
+        wrmsrl(MSR_IA32_SYSENTER_EIP, (unsigned long)sysenter_entry);
+        wrmsr(MSR_IA32_SYSENTER_CS, __HYPERVISOR_CS, 0);
+    }
+
+    /* Trampoline for SYSCALL entry from compatibility mode. */
+    wrmsrl(MSR_CSTAR, stub_va);
+    offset += write_stub_trampoline(stub_page + (stub_va & ~PAGE_MASK),
+                                    stub_va, stack_bottom,
+                                    (unsigned long)cstar_enter);
+
+    /* Don't consume more than half of the stub space here. */
+    ASSERT(offset <= STUB_BUF_SIZE / 2);
+
+    unmap_domain_page(stub_page);
+
+    /* Common SYSCALL parameters. */
+    wrmsrl(MSR_STAR, XEN_MSR_STAR);
+    wrmsrl(MSR_SYSCALL_MASK, XEN_SYSCALL_MASK);
+}
+
+void init_int80_direct_trap(struct vcpu *v)
+{
+    struct trap_info *ti = &v->arch.pv_vcpu.trap_ctxt[0x80];
+    struct trap_bounce *tb = &v->arch.pv_vcpu.int80_bounce;
+
+    tb->flags = TBF_EXCEPTION;
+    tb->cs    = ti->cs;
+    tb->eip   = ti->address;
+
+    if ( null_trap_bounce(v, tb) )
+        tb->flags = 0;
+}
+
+static long register_guest_callback(struct callback_register *reg)
+{
+    long ret = 0;
+    struct vcpu *v = current;
+
+    if ( !is_canonical_address(reg->address) )
+        return -EINVAL;
+
+    switch ( reg->type )
+    {
+    case CALLBACKTYPE_event:
+        v->arch.pv_vcpu.event_callback_eip = reg->address;
+        break;
+
+    case CALLBACKTYPE_failsafe:
+        v->arch.pv_vcpu.failsafe_callback_eip = reg->address;
+        if ( reg->flags & CALLBACKF_mask_events )
+            set_bit(_VGCF_failsafe_disables_events,
+                    &v->arch.vgc_flags);
+        else
+            clear_bit(_VGCF_failsafe_disables_events,
+                      &v->arch.vgc_flags);
+        break;
+
+    case CALLBACKTYPE_syscall:
+        v->arch.pv_vcpu.syscall_callback_eip = reg->address;
+        if ( reg->flags & CALLBACKF_mask_events )
+            set_bit(_VGCF_syscall_disables_events,
+                    &v->arch.vgc_flags);
+        else
+            clear_bit(_VGCF_syscall_disables_events,
+                      &v->arch.vgc_flags);
+        break;
+
+    case CALLBACKTYPE_syscall32:
+        v->arch.pv_vcpu.syscall32_callback_eip = reg->address;
+        v->arch.pv_vcpu.syscall32_disables_events =
+            !!(reg->flags & CALLBACKF_mask_events);
+        break;
+
+    case CALLBACKTYPE_sysenter:
+        v->arch.pv_vcpu.sysenter_callback_eip = reg->address;
+        v->arch.pv_vcpu.sysenter_disables_events =
+            !!(reg->flags & CALLBACKF_mask_events);
+        break;
+
+    case CALLBACKTYPE_nmi:
+        ret = register_guest_nmi_callback(reg->address);
+        break;
+
+    default:
+        ret = -ENOSYS;
+        break;
+    }
+
+    return ret;
+}
+
+static long unregister_guest_callback(struct callback_unregister *unreg)
+{
+    long ret;
+
+    switch ( unreg->type )
+    {
+    case CALLBACKTYPE_event:
+    case CALLBACKTYPE_failsafe:
+    case CALLBACKTYPE_syscall:
+    case CALLBACKTYPE_syscall32:
+    case CALLBACKTYPE_sysenter:
+        ret = -EINVAL;
+        break;
+
+    case CALLBACKTYPE_nmi:
+        ret = unregister_guest_nmi_callback();
+        break;
+
+    default:
+        ret = -ENOSYS;
+        break;
+    }
+
+    return ret;
+}
+
+
+long do_callback_op(int cmd, XEN_GUEST_HANDLE_PARAM(const_void) arg)
+{
+    long ret;
+
+    switch ( cmd )
+    {
+    case CALLBACKOP_register:
+    {
+        struct callback_register reg;
+
+        ret = -EFAULT;
+        if ( copy_from_guest(&reg, arg, 1) )
+            break;
+
+        ret = register_guest_callback(&reg);
+    }
+    break;
+
+    case CALLBACKOP_unregister:
+    {
+        struct callback_unregister unreg;
+
+        ret = -EFAULT;
+        if ( copy_from_guest(&unreg, arg, 1) )
+            break;
+
+        ret = unregister_guest_callback(&unreg);
+    }
+    break;
+
+    default:
+        ret = -ENOSYS;
+        break;
+    }
+
+    return ret;
+}
+
+long do_set_callbacks(unsigned long event_address,
+                      unsigned long failsafe_address,
+                      unsigned long syscall_address)
+{
+    struct callback_register event = {
+        .type = CALLBACKTYPE_event,
+        .address = event_address,
+    };
+    struct callback_register failsafe = {
+        .type = CALLBACKTYPE_failsafe,
+        .address = failsafe_address,
+    };
+    struct callback_register syscall = {
+        .type = CALLBACKTYPE_syscall,
+        .address = syscall_address,
+    };
+
+    register_guest_callback(&event);
+    register_guest_callback(&failsafe);
+    register_guest_callback(&syscall);
+
+    return 0;
+}
+
+void hypercall_page_initialise_ring3_kernel(void *hypercall_page)
+{
+    char *p;
+    int i;
+
+    /* Fill in all the transfer points with template machine code. */
+    for ( i = 0; i < (PAGE_SIZE / 32); i++ )
+    {
+        if ( i == __HYPERVISOR_iret )
+            continue;
+
+        p = (char *)(hypercall_page + (i * 32));
+        *(u8  *)(p+ 0) = 0x51;    /* push %rcx */
+        *(u16 *)(p+ 1) = 0x5341;  /* push %r11 */
+        *(u8  *)(p+ 3) = 0xb8;    /* mov  $<i>,%eax */
+        *(u32 *)(p+ 4) = i;
+        *(u16 *)(p+ 8) = 0x050f;  /* syscall */
+        *(u16 *)(p+10) = 0x5b41;  /* pop  %r11 */
+        *(u8  *)(p+12) = 0x59;    /* pop  %rcx */
+        *(u8  *)(p+13) = 0xc3;    /* ret */
+    }
+
+    /*
+     * HYPERVISOR_iret is special because it doesn't return and expects a
+     * special stack frame. Guests jump at this transfer point instead of
+     * calling it.
+     */
+    p = (char *)(hypercall_page + (__HYPERVISOR_iret * 32));
+    *(u8  *)(p+ 0) = 0x51;    /* push %rcx */
+    *(u16 *)(p+ 1) = 0x5341;  /* push %r11 */
+    *(u8  *)(p+ 3) = 0x50;    /* push %rax */
+    *(u8  *)(p+ 4) = 0xb8;    /* mov  $__HYPERVISOR_iret,%eax */
+    *(u32 *)(p+ 5) = __HYPERVISOR_iret;
+    *(u16 *)(p+ 9) = 0x050f;  /* syscall */
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/x86_64/traps.c b/xen/arch/x86/x86_64/traps.c
index a237f4d5c2..2027a6a4ae 100644
--- a/xen/arch/x86/x86_64/traps.c
+++ b/xen/arch/x86/x86_64/traps.c
@@ -23,6 +23,8 @@
 #include
 #include
 #include
+#include
+#include
 
 #include
 
@@ -254,171 +256,6 @@ void do_double_fault(struct cpu_user_regs *regs)
     panic("DOUBLE FAULT -- system shutdown");
 }
 
-void toggle_guest_mode(struct vcpu *v)
-{
-    if ( is_pv_32bit_vcpu(v) )
-        return;
-    if ( cpu_has_fsgsbase )
-    {
-        if ( v->arch.flags & TF_kernel_mode )
-            v->arch.pv_vcpu.gs_base_kernel = __rdgsbase();
-        else
-            v->arch.pv_vcpu.gs_base_user = __rdgsbase();
-    }
-    v->arch.flags ^= TF_kernel_mode;
-    asm volatile ( "swapgs" );
-    update_cr3(v);
-    /* Don't flush user global mappings from the TLB. Don't tick TLB clock. */
-    asm volatile ( "mov %0, %%cr3" : : "r" (v->arch.cr3) : "memory" );
-
-    if ( !(v->arch.flags & TF_kernel_mode) )
-        return;
-
-    if ( v->arch.pv_vcpu.need_update_runstate_area &&
-         update_runstate_area(v) )
-        v->arch.pv_vcpu.need_update_runstate_area = 0;
-
-    if ( v->arch.pv_vcpu.pending_system_time.version &&
-         update_secondary_system_time(v,
-                                      &v->arch.pv_vcpu.pending_system_time) )
-        v->arch.pv_vcpu.pending_system_time.version = 0;
-}
-
-unsigned long do_iret(void)
-{
-    struct cpu_user_regs *regs = guest_cpu_user_regs();
-    struct iret_context iret_saved;
-    struct vcpu *v = current;
-
-    if ( unlikely(copy_from_user(&iret_saved, (void *)regs->rsp,
-                                 sizeof(iret_saved))) )
-    {
-        gprintk(XENLOG_ERR,
-                "Fault while reading IRET context from guest stack\n");
-        goto exit_and_crash;
-    }
-
-    /* Returning to user mode? */
-    if ( (iret_saved.cs & 3) == 3 )
-    {
-        if ( unlikely(pagetable_is_null(v->arch.guest_table_user)) )
-        {
-            gprintk(XENLOG_ERR,
-                    "Guest switching to user mode with no user page tables\n");
-            goto exit_and_crash;
-        }
-        toggle_guest_mode(v);
-    }
-
-    if ( VM_ASSIST(v->domain, architectural_iopl) )
-        v->arch.pv_vcpu.iopl = iret_saved.rflags & X86_EFLAGS_IOPL;
-
-    regs->rip    = iret_saved.rip;
-    regs->cs     = iret_saved.cs | 3; /* force guest privilege */
-    regs->rflags = ((iret_saved.rflags & ~(X86_EFLAGS_IOPL|X86_EFLAGS_VM))
-                    | X86_EFLAGS_IF);
-    regs->rsp    = iret_saved.rsp;
-    regs->ss     = iret_saved.ss | 3; /* force guest privilege */
-
-    if ( !(iret_saved.flags & VGCF_in_syscall) )
-    {
-        regs->entry_vector &= ~TRAP_syscall;
-        regs->r11 = iret_saved.r11;
-        regs->rcx = iret_saved.rcx;
-    }
-
-    /* Restore upcall mask from supplied EFLAGS.IF. */
-    vcpu_info(v, evtchn_upcall_mask) = !(iret_saved.rflags & X86_EFLAGS_IF);
-
-    async_exception_cleanup(v);
-
-    /* Saved %rax gets written back to regs->rax in entry.S. */
-    return iret_saved.rax;
-
- exit_and_crash:
-    domain_crash(v->domain);
-    return 0;
-}
-
-static unsigned int write_stub_trampoline(
-    unsigned char *stub, unsigned long stub_va,
-    unsigned long stack_bottom, unsigned long target_va)
-{
-    /* movabsq %rax, stack_bottom - 8 */
-    stub[0] = 0x48;
-    stub[1] = 0xa3;
-    *(uint64_t *)&stub[2] = stack_bottom - 8;
-
-    /* movq %rsp, %rax */
-    stub[10] = 0x48;
-    stub[11] = 0x89;
-    stub[12] = 0xe0;
-
-    /* movabsq $stack_bottom - 8, %rsp */
-    stub[13] = 0x48;
-    stub[14] = 0xbc;
-    *(uint64_t *)&stub[15] = stack_bottom - 8;
-
-    /* pushq %rax */
-    stub[23] = 0x50;
-
-    /* jmp target_va */
-    stub[24] = 0xe9;
-    *(int32_t *)&stub[25] = target_va - (stub_va + 29);
-
-    /* Round up to a multiple of 16 bytes. */
-    return 32;
-}
-
-DEFINE_PER_CPU(struct stubs, stubs);
-void lstar_enter(void);
-void cstar_enter(void);
-
-static void pv_percpu_traps_init(void)
-{
-    unsigned long stack_bottom = get_stack_bottom();
-    unsigned long stub_va = this_cpu(stubs.addr);
-    unsigned char *stub_page;
-    unsigned int offset;
-
-    stub_page = map_domain_page(_mfn(this_cpu(stubs.mfn)));
-
-    /*
-     * Trampoline for SYSCALL entry from 64-bit mode. The VT-x HVM vcpu
-     * context switch logic relies on the SYSCALL trampoline being at the
-     * start of the stubs.
-     */
-    wrmsrl(MSR_LSTAR, stub_va);
-    offset = write_stub_trampoline(stub_page + (stub_va & ~PAGE_MASK),
-                                   stub_va, stack_bottom,
-                                   (unsigned long)lstar_enter);
-    stub_va += offset;
-
-    if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL ||
-         boot_cpu_data.x86_vendor == X86_VENDOR_CENTAUR )
-    {
-        /* SYSENTER entry. */
-        wrmsrl(MSR_IA32_SYSENTER_ESP, stack_bottom);
-        wrmsrl(MSR_IA32_SYSENTER_EIP, (unsigned long)sysenter_entry);
-        wrmsr(MSR_IA32_SYSENTER_CS, __HYPERVISOR_CS, 0);
-    }
-
-    /* Trampoline for SYSCALL entry from compatibility mode. */
-    wrmsrl(MSR_CSTAR, stub_va);
-    offset += write_stub_trampoline(stub_page + (stub_va & ~PAGE_MASK),
-                                    stub_va, stack_bottom,
-                                    (unsigned long)cstar_enter);
-
-    /* Don't consume more than half of the stub space here. */
-    ASSERT(offset <= STUB_BUF_SIZE / 2);
-
-    unmap_domain_page(stub_page);
-
-    /* Common SYSCALL parameters. */
-    wrmsrl(MSR_STAR, XEN_MSR_STAR);
-    wrmsrl(MSR_SYSCALL_MASK, XEN_SYSCALL_MASK);
-}
-
 void subarch_percpu_traps_init(void)
 {
     /* IST_MAX IST pages + 1 syscall page + 1 guard page + primary stack. */
@@ -427,202 +264,6 @@ void subarch_percpu_traps_init(void)
     pv_percpu_traps_init();
 }
 
-void init_int80_direct_trap(struct vcpu *v)
-{
-    struct trap_info *ti = &v->arch.pv_vcpu.trap_ctxt[0x80];
-    struct trap_bounce *tb = &v->arch.pv_vcpu.int80_bounce;
-
-    tb->flags = TBF_EXCEPTION;
-    tb->cs    = ti->cs;
-    tb->eip   = ti->address;
-
-    if ( null_trap_bounce(v, tb) )
-        tb->flags = 0;
-}
-
-static long register_guest_callback(struct callback_register *reg)
-{
-    long ret = 0;
-    struct vcpu *v = current;
-
-    if ( !is_canonical_address(reg->address) )
-        return -EINVAL;
-
-    switch ( reg->type )
-    {
-    case CALLBACKTYPE_event:
-        v->arch.pv_vcpu.event_callback_eip = reg->address;
-        break;
-
-    case CALLBACKTYPE_failsafe:
-        v->arch.pv_vcpu.failsafe_callback_eip = reg->address;
-        if ( reg->flags & CALLBACKF_mask_events )
-            set_bit(_VGCF_failsafe_disables_events,
-                    &v->arch.vgc_flags);
-        else
-            clear_bit(_VGCF_failsafe_disables_events,
-                      &v->arch.vgc_flags);
-        break;
-
-    case CALLBACKTYPE_syscall:
-        v->arch.pv_vcpu.syscall_callback_eip = reg->address;
-        if ( reg->flags & CALLBACKF_mask_events )
-            set_bit(_VGCF_syscall_disables_events,
-                    &v->arch.vgc_flags);
-        else
-            clear_bit(_VGCF_syscall_disables_events,
-                      &v->arch.vgc_flags);
-        break;
-
-    case CALLBACKTYPE_syscall32:
-        v->arch.pv_vcpu.syscall32_callback_eip = reg->address;
-        v->arch.pv_vcpu.syscall32_disables_events =
-            !!(reg->flags & CALLBACKF_mask_events);
-        break;
-
-    case CALLBACKTYPE_sysenter:
-        v->arch.pv_vcpu.sysenter_callback_eip = reg->address;
-        v->arch.pv_vcpu.sysenter_disables_events =
-            !!(reg->flags & CALLBACKF_mask_events);
-        break;
-
-    case CALLBACKTYPE_nmi:
-        ret = register_guest_nmi_callback(reg->address);
-        break;
-
-    default:
-        ret = -ENOSYS;
-        break;
-    }
-
-    return ret;
-}
-
-static long unregister_guest_callback(struct callback_unregister *unreg)
-{
-    long ret;
-
-    switch ( unreg->type )
-    {
-    case CALLBACKTYPE_event:
-    case CALLBACKTYPE_failsafe:
-    case CALLBACKTYPE_syscall:
-    case CALLBACKTYPE_syscall32:
-    case CALLBACKTYPE_sysenter:
-        ret = -EINVAL;
-        break;
-
-    case CALLBACKTYPE_nmi:
-        ret = unregister_guest_nmi_callback();
-        break;
-
-    default:
-        ret = -ENOSYS;
-        break;
-    }
-
-    return ret;
-}
-
-
-long do_callback_op(int cmd, XEN_GUEST_HANDLE_PARAM(const_void) arg)
-{
-    long ret;
-
-    switch ( cmd )
-    {
-    case CALLBACKOP_register:
-    {
-        struct callback_register reg;
-
-        ret = -EFAULT;
-        if ( copy_from_guest(&reg, arg, 1) )
-            break;
-
-        ret = register_guest_callback(&reg);
-    }
-    break;
-
-    case CALLBACKOP_unregister:
-    {
-        struct callback_unregister unreg;
-
-        ret = -EFAULT;
-        if ( copy_from_guest(&unreg, arg, 1) )
-            break;
-
-        ret = unregister_guest_callback(&unreg);
-    }
-    break;
-
-    default:
-        ret = -ENOSYS;
-        break;
-    }
-
-    return ret;
-}
-
-long do_set_callbacks(unsigned long event_address,
-                      unsigned long failsafe_address,
-                      unsigned long syscall_address)
-{
-    struct callback_register event = {
-        .type = CALLBACKTYPE_event,
-        .address = event_address,
-    };
-    struct callback_register failsafe = {
-        .type = CALLBACKTYPE_failsafe,
-        .address = failsafe_address,
-    };
-    struct callback_register syscall = {
-        .type = CALLBACKTYPE_syscall,
-        .address = syscall_address,
-    };
-
-    register_guest_callback(&event);
-    register_guest_callback(&failsafe);
-    register_guest_callback(&syscall);
-
-    return 0;
-}
-
-static void hypercall_page_initialise_ring3_kernel(void *hypercall_page)
-{
-    char *p;
-    int i;
-
-    /* Fill in all the transfer points with template machine code. */
-    for ( i = 0; i < (PAGE_SIZE / 32); i++ )
-    {
-        if ( i == __HYPERVISOR_iret )
-            continue;
-
-        p = (char *)(hypercall_page + (i * 32));
-        *(u8  *)(p+ 0) = 0x51;    /* push %rcx */
-        *(u16 *)(p+ 1) = 0x5341;  /* push %r11 */
-        *(u8  *)(p+ 3) = 0xb8;    /* mov  $<i>,%eax */
-        *(u32 *)(p+ 4) = i;
-        *(u16 *)(p+ 8) = 0x050f;  /* syscall */
-        *(u16 *)(p+10) = 0x5b41;  /* pop  %r11 */
-        *(u8  *)(p+12) = 0x59;    /* pop  %rcx */
-        *(u8  *)(p+13) = 0xc3;    /* ret */
-    }
-
-    /*
-     * HYPERVISOR_iret is special because it doesn't return and expects a
-     * special stack frame. Guests jump at this transfer point instead of
-     * calling it.
-     */
-    p = (char *)(hypercall_page + (__HYPERVISOR_iret * 32));
-    *(u8  *)(p+ 0) = 0x51;    /* push %rcx */
-    *(u16 *)(p+ 1) = 0x5341;  /* push %r11 */
-    *(u8  *)(p+ 3) = 0x50;    /* push %rax */
-    *(u8  *)(p+ 4) = 0xb8;    /* mov  $__HYPERVISOR_iret,%eax */
-    *(u32 *)(p+ 5) = __HYPERVISOR_iret;
-    *(u16 *)(p+ 9) = 0x050f;  /* syscall */
-}
-
 #include "compat/traps.c"
 
 void hypercall_page_initialise(struct domain *d, void *hypercall_page)
diff --git a/xen/include/asm-x86/pv/domain.h b/xen/include/asm-x86/pv/domain.h
index acdf140fbd..dfa60b080c 100644
--- a/xen/include/asm-x86/pv/domain.h
+++ b/xen/include/asm-x86/pv/domain.h
@@ -29,6 +29,8 @@ void pv_domain_destroy(struct domain *d);
 int pv_domain_initialise(struct domain *d, unsigned int domcr_flags,
                          struct xen_arch_domainconfig *config);
 
+void hypercall_page_initialise_ring3_kernel(void *hypercall_page);
+
 #else /* !CONFIG_PV */
 
 #include
 
@@ -42,6 +44,9 @@ static inline int pv_domain_initialise(struct domain *d,
 {
     return -EOPNOTSUPP;
 }
+
+static inline void hypercall_page_initialise_ring3_kernel(void *hypercall_page) {}
+
 #endif /* CONFIG_PV */
 
 void paravirt_ctxt_switch_from(struct vcpu *v);
diff --git a/xen/include/asm-x86/pv/traps.h b/xen/include/asm-x86/pv/traps.h
index f41287add7..43d9112b6d 100644
--- a/xen/include/asm-x86/pv/traps.h
+++ b/xen/include/asm-x86/pv/traps.h
@@ -30,6 +30,8 @@ void emulate_gate_op(struct cpu_user_regs *regs);
 int emulate_forced_invalid_op(struct cpu_user_regs *regs);
 int emulate_invalid_rdtscp(struct cpu_user_regs *regs);
 
+void pv_percpu_traps_init(void);
+
 #else /* !CONFIG_PV */
 
 #include
 
@@ -39,6 +41,8 @@ void emulate_gate_op(struct cpu_user_regs *regs) {}
 int emulate_forced_invalid_op(struct cpu_user_regs *regs) { return -EOPNOTSUPP; }
 int emulate_invalid_rdtscp(struct cpu_user_regs *regs) { return -EOPNOTSUPP; }
 
+static inline void pv_percpu_traps_init(void) {}
+
 #endif /* CONFIG_PV */
 
 #endif /* __X86_PV_TRAPS_H__ */