
[1/2] x86/vtx: Fix fault semantics for early task switch failures

Message ID 20191121221551.1175-2-andrew.cooper3@citrix.com (mailing list archive)
State Superseded
Series x86/hvm: Multiple corrections to task switch handling

Commit Message

Andrew Cooper Nov. 21, 2019, 10:15 p.m. UTC
The VT-x task switch handler adds inst_len to rip before calling
hvm_task_switch().  This causes early faults to be delivered to the guest with
trap semantics, breaking restartability.

Instead, pass the instruction length into hvm_task_switch() and write it into
the outgoing tss only, leaving rip in its original location.

For now, pass 0 on the SVM side.  This highlights a separate preexisting bug
which will be addressed in the following patch.

While adjusting call sites, drop the unnecessary uint16_t cast.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Wei Liu <wl@xen.org>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: Juergen Gross <jgross@suse.com>
---
 xen/arch/x86/hvm/hvm.c        | 4 ++--
 xen/arch/x86/hvm/svm/svm.c    | 2 +-
 xen/arch/x86/hvm/vmx/vmx.c    | 4 ++--
 xen/include/asm-x86/hvm/hvm.h | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)
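
The crux of the change, condensed from the patch below (names as in
vmx.c): the %rip adjustment moves out of the VMX exit handler and into
the TSS image that hvm_task_switch() writes out.

/* Before: %rip is advanced before the call, so any fault injected from
 * inside hvm_task_switch() points past the instruction (trap
 * semantics), and the guest cannot transparently retry it. */
regs->rip += inst_len;
hvm_task_switch((uint16_t)exit_qualification, reasons[source], ecode);

/* After: %rip is left untouched; the advanced value is written only
 * into the outgoing TSS, so an early fault still points at the
 * original instruction (fault semantics) and stays restartable. */
hvm_task_switch(exit_qualification, reasons[source], ecode, inst_len);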

Comments

Roger Pau Monné Nov. 22, 2019, 12:37 p.m. UTC | #1
On Thu, Nov 21, 2019 at 10:15:50PM +0000, Andrew Cooper wrote:
> The VT-x task switch handler adds inst_len to rip before calling
> hvm_task_switch().  This causes early faults to be delivered to the guest with

By early faults you mean faults injected by hvm_task_switch itself for
example?

> trap semantics, breaking restartability.
> 
> Instead, pass the instruction length into hvm_task_switch() and write it into
> the outgoing tss only, leaving rip in its original location.
> 
> For now, pass 0 on the SVM side.  This highlights a separate preexisting bug
> which will be addressed in the following patch.
> 
> While adjusting call sites, drop the unnecessary uint16_t cast.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Code LGTM:

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.
Andrew Cooper Nov. 22, 2019, 12:43 p.m. UTC | #2
On 22/11/2019 12:37, Roger Pau Monné wrote:
> On Thu, Nov 21, 2019 at 10:15:50PM +0000, Andrew Cooper wrote:
>> The VT-x task switch handler adds inst_len to rip before calling
>> hvm_task_switch().  This causes early faults to be delivered to the guest with
> By early faults you mean faults injected by hvm_task_switch itself for
> example?

A task switch is restartable up to a point.  Beyond that point, any
chaos reigns in the new task, not the old task.

By "early", I mean any fault which is handled in the context of the old
task.  As far as testing goes, mapping the current TSS read-only is
about the only way I've found to cause this to occur, because all other
fault conditions are checked by the processor before it issues a
TASK_SWITCH VMExit.
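
For illustration, a sketch of the failure path in question.  The helper
names and the exact shape of the code are assumptions for illustration,
not lifted verbatim from hvm.c:

/* Sketch: an early fault, raised while still in the old task's
 * context (helper names are illustrative). */
if ( hvm_copy_to_guest_linear(prev_tr.base, &tss, sizeof(tss), 0,
                              &pfinfo) != HVMTRANS_okay )
{
    /*
     * E.g. the outgoing TSS is mapped read-only.  The injected #PF
     * must point at the original instruction (fault semantics) so
     * the guest can fix the mapping and retry the task switch.
     */
    hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
    return;
}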

>
>> trap semantics, breaking restartability.
>>
>> Instead, pass the instruction length into hvm_task_switch() and write it into
>> the outgoing tss only, leaving rip in its original location.
>>
>> For now, pass 0 on the SVM side.  This highlights a separate preexisting bug
>> which will be addressed in the following patch.
>>
>> While adjusting call sites, drop the unnecessary uint16_t cast.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Code LGTM:
>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks,

~Andrew
Jan Beulich Nov. 22, 2019, 1:08 p.m. UTC | #3
On 22.11.2019 13:37, Roger Pau Monné  wrote:
> On Thu, Nov 21, 2019 at 10:15:50PM +0000, Andrew Cooper wrote:
>> The VT-x task switch handler adds inst_len to rip before calling
>> hvm_task_switch().  This causes early faults to be delivered to the guest with
>> trap semantics, breaking restartability.
>>
>> Instead, pass the instruction length into hvm_task_switch() and write it into
>> the outgoing tss only, leaving rip in its original location.
>>
>> For now, pass 0 on the SVM side.  This highlights a separate preexisting bug
>> which will be addressed in the following patch.
>>
>> While adjusting call sites, drop the unnecessary uint16_t cast.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> 
> Code LGTM:
> 
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Acked-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper Nov. 22, 2019, 1:12 p.m. UTC | #4
On 22/11/2019 13:08, Jan Beulich wrote:
> On 22.11.2019 13:37, Roger Pau Monné  wrote:
>> On Thu, Nov 21, 2019 at 10:15:50PM +0000, Andrew Cooper wrote:
>>> The VT-x task switch handler adds inst_len to rip before calling
>>> hvm_task_switch().  This causes early faults to be delivered to the guest with
>>> trap semantics, breaking restartability.
>>>
>>> Instead, pass the instruction length into hvm_task_switch() and write it into
>>> the outgoing tss only, leaving rip in its original location.
>>>
>>> For now, pass 0 on the SVM side.  This highlights a separate preexisting bug
>>> which will be addressed in the following patch.
>>>
>>> While adjusting call sites, drop the unnecessary uint16_t cast.
>>>
>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> Code LGTM:
>>
>> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>

It occurs to me that this also fixes a vmentry failure in the corner
case where an instruction which crosses the 4G=>0 boundary takes a
fault.  With the old code, %rip would be adjusted without being
truncated.

~Andrew
Jan Beulich Nov. 22, 2019, 1:39 p.m. UTC | #5
On 22.11.2019 14:12, Andrew Cooper wrote:
> On 22/11/2019 13:08, Jan Beulich wrote:
>> On 22.11.2019 13:37, Roger Pau Monné  wrote:
>>> On Thu, Nov 21, 2019 at 10:15:50PM +0000, Andrew Cooper wrote:
>>>> The VT-x task switch handler adds inst_len to rip before calling
>>>> hvm_task_switch().  This causes early faults to be delivered to the guest with
>>>> trap semantics, breaking restartability.
>>>>
>>>> Instead, pass the instruction length into hvm_task_switch() and write it into
>>>> the outgoing tss only, leaving rip in its original location.
>>>>
>>>> For now, pass 0 on the SVM side.  This highlights a separate preexisting bug
>>>> which will be addressed in the following patch.
>>>>
>>>> While adjusting call sites, drop the unnecessary uint16_t cast.
>>>>
>>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>> Code LGTM:
>>>
>>> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
>> Acked-by: Jan Beulich <jbeulich@suse.com>
> 
> It occurs to me that this also fixes a vmentry failure in the corner
> case that an instruction, which crosses the 4G=>0 boundary takes a
> fault.  %rip will be adjusted without being truncated.

I was about to say so in my earlier reply, until I paid attention
to this

@@ -2987,7 +2987,7 @@ void hvm_task_switch(
     if ( taskswitch_reason == TSW_iret )
         eflags &= ~X86_EFLAGS_NT;
 
-    tss.eip    = regs->eip;
+    tss.eip    = regs->eip + insn_len;

together with the subsequent

    regs->rip    = tss.eip;

already having taken care of this aspect before, afaict.

Jan
Andrew Cooper Nov. 22, 2019, 2:51 p.m. UTC | #6
On 22/11/2019 13:39, Jan Beulich wrote:
> On 22.11.2019 14:12, Andrew Cooper wrote:
>> On 22/11/2019 13:08, Jan Beulich wrote:
>>> On 22.11.2019 13:37, Roger Pau Monné  wrote:
>>>> On Thu, Nov 21, 2019 at 10:15:50PM +0000, Andrew Cooper wrote:
>>>>> The VT-x task switch handler adds inst_len to rip before calling
>>>>> hvm_task_switch().  This causes early faults to be delivered to the guest with
>>>>> trap semantics, breaking restartability.
>>>>>
>>>>> Instead, pass the instruction length into hvm_task_switch() and write it into
>>>>> the outgoing tss only, leaving rip in its original location.
>>>>>
>>>>> For now, pass 0 on the SVM side.  This highlights a separate preexisting bug
>>>>> which will be addressed in the following patch.
>>>>>
>>>>> While adjusting call sites, drop the unnecessary uint16_t cast.
>>>>>
>>>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>>> Code LGTM:
>>>>
>>>> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
>>> Acked-by: Jan Beulich <jbeulich@suse.com>
>> It occurs to me that this also fixes a vmentry failure in the corner
>> case where an instruction which crosses the 4G=>0 boundary takes a
>> fault.  With the old code, %rip would be adjusted without being
>> truncated.
> I was about to say so in my earlier reply, until I paid attention
> to this
>
> @@ -2987,7 +2987,7 @@ void hvm_task_switch(
>      if ( taskswitch_reason == TSW_iret )
>          eflags &= ~X86_EFLAGS_NT;
>  
> -    tss.eip    = regs->eip;
> +    tss.eip    = regs->eip + insn_len;
>
> together with the subsequent
>
>     regs->rip    = tss.eip;
>
> already having taken care of this aspect before, afaict.

This takes care of things for a task switch which completes
successfully, but not for one which faulted (and ended up being
delivered with trap semantics).  In that case, the result of the (now
deleted) regs->rip += inst_len; would end up un-truncated.
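
Concretely, a worked example of the truncation (an editorial
illustration, assuming the usual cpu_user_regs layout in which eip
aliases the low 32 bits of the 64-bit rip field):

/* 32-bit guest; the instruction crosses the 4G=>0 boundary. */
uint64_t rip = 0xfffffffeUL;            /* starts just below 4G */
unsigned int insn_len = 4;

rip += insn_len;                        /* old faulting path: 0x100000002,
                                         * not a valid 32-bit %eip
                                         * => vmentry failure */

uint32_t eip = 0xfffffffeUL + insn_len; /* new path: tss.eip is 32 bits
                                         * wide, so the sum truncates to
                                         * 2, as the guest expects */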

~Andrew
Tian, Kevin Nov. 25, 2019, 8:23 a.m. UTC | #7
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: Friday, November 22, 2019 6:16 AM
> 
> The VT-x task switch handler adds inst_len to rip before calling
> hvm_task_switch().  This causes early faults to be delivered to the guest
> with
> trap semantics, and break restartibility.
> 
> Instead, pass the instruction length into hvm_task_switch() and write it into
> the outgoing tss only, leaving rip in its original location.
> 
> For now, pass 0 on the SVM side.  This highlights a separate preexisting bug
> which will be addressed in the following patch.
> 
> While adjusting call sites, drop the unnecessary uint16_t cast.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

Patch

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 818e705fd1..7f556171bd 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2913,7 +2913,7 @@  void hvm_prepare_vm86_tss(struct vcpu *v, uint32_t base, uint32_t limit)
 
 void hvm_task_switch(
     uint16_t tss_sel, enum hvm_task_switch_reason taskswitch_reason,
-    int32_t errcode)
+    int32_t errcode, unsigned int insn_len)
 {
     struct vcpu *v = current;
     struct cpu_user_regs *regs = guest_cpu_user_regs();
@@ -2987,7 +2987,7 @@  void hvm_task_switch(
     if ( taskswitch_reason == TSW_iret )
         eflags &= ~X86_EFLAGS_NT;
 
-    tss.eip    = regs->eip;
+    tss.eip    = regs->eip + insn_len;
     tss.eflags = eflags;
     tss.eax    = regs->eax;
     tss.ecx    = regs->ecx;
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 4eb6b0e4c7..049b800e20 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -2794,7 +2794,7 @@  void svm_vmexit_handler(struct cpu_user_regs *regs)
          */
         vmcb->eventinj.bytes = 0;
 
-        hvm_task_switch((uint16_t)vmcb->exitinfo1, reason, errcode);
+        hvm_task_switch(vmcb->exitinfo1, reason, errcode, 0);
         break;
     }
 
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 6a5eeb5c13..6d048852c3 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -3956,8 +3956,8 @@  void vmx_vmexit_handler(struct cpu_user_regs *regs)
             __vmread(IDT_VECTORING_ERROR_CODE, &ecode);
         else
              ecode = -1;
-        regs->rip += inst_len;
-        hvm_task_switch((uint16_t)exit_qualification, reasons[source], ecode);
+
+        hvm_task_switch(exit_qualification, reasons[source], ecode, inst_len);
         break;
     }
     case EXIT_REASON_CPUID:
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index f86af09898..4cce59bb31 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -297,7 +297,7 @@  void hvm_set_rdtsc_exiting(struct domain *d, bool_t enable);
 enum hvm_task_switch_reason { TSW_jmp, TSW_iret, TSW_call_or_int };
 void hvm_task_switch(
     uint16_t tss_sel, enum hvm_task_switch_reason taskswitch_reason,
-    int32_t errcode);
+    int32_t errcode, unsigned int insn_len);
 
 enum hvm_access_type {
     hvm_access_insn_fetch,