diff mbox series

[RFC,v6,74/92] kvm: x86: do not unconditionally patch the hypercall instruction during emulation

Message ID 20190809160047.8319-75-alazar@bitdefender.com (mailing list archive)
State New, archived
Headers show
Series VM introspection | expand

Commit Message

Adalbert Lazăr Aug. 9, 2019, 4 p.m. UTC
From: Mihai Donțu <mdontu@bitdefender.com>

It can happened for us to end up emulating the VMCALL instruction as a
result of the handling of an EPT write fault. In this situation, the
emulator will try to unconditionally patch the correct hypercall opcode
bytes using emulator_write_emulated(). However, this last call uses the
fault GPA (if available) or walks the guest page tables at RIP,
otherwise. The trouble begins when using KVMI, when we forbid the use of
the fault GPA and fallback to the guest pt walk: in Windows (8.1 and
newer) the page that we try to write into is marked read-execute and as
such emulator_write_emulated() fails and we inject a write #PF, leading
to a guest crash.

The fix is rather simple: check the existing instruction bytes before
doing the patching. This does not change the normal KVM behaviour, but
does help when using KVMI as we no longer inject a write #PF.

CC: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Mihai Donțu <mdontu@bitdefender.com>
Signed-off-by: Adalbert Lazăr <alazar@bitdefender.com>
---
 arch/x86/kvm/x86.c | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

Comments

Paolo Bonzini Aug. 13, 2019, 9:20 a.m. UTC | #1
On 09/08/19 18:00, Adalbert Lazăr wrote:
> From: Mihai Donțu <mdontu@bitdefender.com>
> 
> It can happened for us to end up emulating the VMCALL instruction as a
> result of the handling of an EPT write fault. In this situation, the
> emulator will try to unconditionally patch the correct hypercall opcode
> bytes using emulator_write_emulated(). However, this last call uses the
> fault GPA (if available) or walks the guest page tables at RIP,
> otherwise. The trouble begins when using KVMI, when we forbid the use of
> the fault GPA and fallback to the guest pt walk: in Windows (8.1 and
> newer) the page that we try to write into is marked read-execute and as
> such emulator_write_emulated() fails and we inject a write #PF, leading
> to a guest crash.
> 
> The fix is rather simple: check the existing instruction bytes before
> doing the patching. This does not change the normal KVM behaviour, but
> does help when using KVMI as we no longer inject a write #PF.

Fixing the hypercall is just an optimization.  Can we just hush and
return to the guest if emulator_write_emulated returns
X86EMUL_PROPAGATE_FAULT?

Paolo

> CC: Joerg Roedel <joro@8bytes.org>
> Signed-off-by: Mihai Donțu <mdontu@bitdefender.com>
> Signed-off-by: Adalbert Lazăr <alazar@bitdefender.com>
> ---
>  arch/x86/kvm/x86.c | 23 ++++++++++++++++++++---
>  1 file changed, 20 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 04b1d2916a0a..965c4f0108eb 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7363,16 +7363,33 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>  }
>  EXPORT_SYMBOL_GPL(kvm_emulate_hypercall);
>  
> +#define KVM_HYPERCALL_INSN_LEN 3
> +
>  static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
>  {
> +	int err;
>  	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
> -	char instruction[3];
> +	char buf[KVM_HYPERCALL_INSN_LEN];
> +	char instruction[KVM_HYPERCALL_INSN_LEN];
>  	unsigned long rip = kvm_rip_read(vcpu);
>  
> +	err = emulator_read_emulated(ctxt, rip, buf, sizeof(buf),
> +				     &ctxt->exception);
> +	if (err != X86EMUL_CONTINUE)
> +		return err;
> +
>  	kvm_x86_ops->patch_hypercall(vcpu, instruction);
> +	if (!memcmp(instruction, buf, sizeof(instruction)))
> +		/*
> +		 * The hypercall instruction is the correct one. Retry
> +		 * its execution maybe we got here as a result of an
> +		 * event other than #UD which has been resolved in the
> +		 * mean time.
> +		 */
> +		return X86EMUL_CONTINUE;
>  
> -	return emulator_write_emulated(ctxt, rip, instruction, 3,
> -		&ctxt->exception);
> +	return emulator_write_emulated(ctxt, rip, instruction,
> +				       sizeof(instruction), &ctxt->exception);
>  }
>  
>  static int dm_request_for_irq_injection(struct kvm_vcpu *vcpu)
>
Adalbert Lazăr Aug. 14, 2019, 12:07 p.m. UTC | #2
On Tue, 13 Aug 2019 11:20:45 +0200, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 09/08/19 18:00, Adalbert Lazăr wrote:
> > From: Mihai Donțu <mdontu@bitdefender.com>
> > 
> > It can happened for us to end up emulating the VMCALL instruction as a
> > result of the handling of an EPT write fault. In this situation, the
> > emulator will try to unconditionally patch the correct hypercall opcode
> > bytes using emulator_write_emulated(). However, this last call uses the
> > fault GPA (if available) or walks the guest page tables at RIP,
> > otherwise. The trouble begins when using KVMI, when we forbid the use of
> > the fault GPA and fallback to the guest pt walk: in Windows (8.1 and
> > newer) the page that we try to write into is marked read-execute and as
> > such emulator_write_emulated() fails and we inject a write #PF, leading
> > to a guest crash.
> > 
> > The fix is rather simple: check the existing instruction bytes before
> > doing the patching. This does not change the normal KVM behaviour, but
> > does help when using KVMI as we no longer inject a write #PF.
> 
> Fixing the hypercall is just an optimization.  Can we just hush and
> return to the guest if emulator_write_emulated returns
> X86EMUL_PROPAGATE_FAULT?
> 
> Paolo

Something like this?

	err = emulator_write_emulated(...);
	if (err == X86EMUL_PROPAGATE_FAULT)
		err = X86EMUL_CONTINUE;
	return err;

> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 04b1d2916a0a..965c4f0108eb 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -7363,16 +7363,33 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
> >  }
> >  EXPORT_SYMBOL_GPL(kvm_emulate_hypercall);
> >  
> > +#define KVM_HYPERCALL_INSN_LEN 3
> > +
> >  static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
> >  {
> > +	int err;
> >  	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
> > -	char instruction[3];
> > +	char buf[KVM_HYPERCALL_INSN_LEN];
> > +	char instruction[KVM_HYPERCALL_INSN_LEN];
> >  	unsigned long rip = kvm_rip_read(vcpu);
> >  
> > +	err = emulator_read_emulated(ctxt, rip, buf, sizeof(buf),
> > +				     &ctxt->exception);
> > +	if (err != X86EMUL_CONTINUE)
> > +		return err;
> > +
> >  	kvm_x86_ops->patch_hypercall(vcpu, instruction);
> > +	if (!memcmp(instruction, buf, sizeof(instruction)))
> > +		/*
> > +		 * The hypercall instruction is the correct one. Retry
> > +		 * its execution maybe we got here as a result of an
> > +		 * event other than #UD which has been resolved in the
> > +		 * mean time.
> > +		 */
> > +		return X86EMUL_CONTINUE;
> >  
> > -	return emulator_write_emulated(ctxt, rip, instruction, 3,
> > -		&ctxt->exception);
> > +	return emulator_write_emulated(ctxt, rip, instruction,
> > +				       sizeof(instruction), &ctxt->exception);
> >  }
Paolo Bonzini Aug. 14, 2019, 12:33 p.m. UTC | #3
On 14/08/19 14:07, Adalbert Lazăr wrote:
> On Tue, 13 Aug 2019 11:20:45 +0200, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> On 09/08/19 18:00, Adalbert Lazăr wrote:
>>> From: Mihai Donțu <mdontu@bitdefender.com>
>>>
>>> It can happened for us to end up emulating the VMCALL instruction as a
>>> result of the handling of an EPT write fault. In this situation, the
>>> emulator will try to unconditionally patch the correct hypercall opcode
>>> bytes using emulator_write_emulated(). However, this last call uses the
>>> fault GPA (if available) or walks the guest page tables at RIP,
>>> otherwise. The trouble begins when using KVMI, when we forbid the use of
>>> the fault GPA and fallback to the guest pt walk: in Windows (8.1 and
>>> newer) the page that we try to write into is marked read-execute and as
>>> such emulator_write_emulated() fails and we inject a write #PF, leading
>>> to a guest crash.
>>>
>>> The fix is rather simple: check the existing instruction bytes before
>>> doing the patching. This does not change the normal KVM behaviour, but
>>> does help when using KVMI as we no longer inject a write #PF.
>>
>> Fixing the hypercall is just an optimization.  Can we just hush and
>> return to the guest if emulator_write_emulated returns
>> X86EMUL_PROPAGATE_FAULT?
>>
>> Paolo
> 
> Something like this?
> 
> 	err = emulator_write_emulated(...);
> 	if (err == X86EMUL_PROPAGATE_FAULT)
> 		err = X86EMUL_CONTINUE;
> 	return err;

Yes.  The only difference will be that you'll keep getting #UD vmexits
instead of hypercall vmexits.  It's also safer, we want to obey those
r-x permissions because PatchGuard would crash the system if it noticed
the rewriting for whatever reason.

Paolo

>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index 04b1d2916a0a..965c4f0108eb 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -7363,16 +7363,33 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>>>  }
>>>  EXPORT_SYMBOL_GPL(kvm_emulate_hypercall);
>>>  
>>> +#define KVM_HYPERCALL_INSN_LEN 3
>>> +
>>>  static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
>>>  {
>>> +	int err;
>>>  	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
>>> -	char instruction[3];
>>> +	char buf[KVM_HYPERCALL_INSN_LEN];
>>> +	char instruction[KVM_HYPERCALL_INSN_LEN];
>>>  	unsigned long rip = kvm_rip_read(vcpu);
>>>  
>>> +	err = emulator_read_emulated(ctxt, rip, buf, sizeof(buf),
>>> +				     &ctxt->exception);
>>> +	if (err != X86EMUL_CONTINUE)
>>> +		return err;
>>> +
>>>  	kvm_x86_ops->patch_hypercall(vcpu, instruction);
>>> +	if (!memcmp(instruction, buf, sizeof(instruction)))
>>> +		/*
>>> +		 * The hypercall instruction is the correct one. Retry
>>> +		 * its execution maybe we got here as a result of an
>>> +		 * event other than #UD which has been resolved in the
>>> +		 * mean time.
>>> +		 */
>>> +		return X86EMUL_CONTINUE;
>>>  
>>> -	return emulator_write_emulated(ctxt, rip, instruction, 3,
>>> -		&ctxt->exception);
>>> +	return emulator_write_emulated(ctxt, rip, instruction,
>>> +				       sizeof(instruction), &ctxt->exception);
>>>  }
diff mbox series

Patch

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 04b1d2916a0a..965c4f0108eb 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7363,16 +7363,33 @@  int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 }
 EXPORT_SYMBOL_GPL(kvm_emulate_hypercall);
 
+#define KVM_HYPERCALL_INSN_LEN 3
+
 static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
 {
+	int err;
 	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
-	char instruction[3];
+	char buf[KVM_HYPERCALL_INSN_LEN];
+	char instruction[KVM_HYPERCALL_INSN_LEN];
 	unsigned long rip = kvm_rip_read(vcpu);
 
+	err = emulator_read_emulated(ctxt, rip, buf, sizeof(buf),
+				     &ctxt->exception);
+	if (err != X86EMUL_CONTINUE)
+		return err;
+
 	kvm_x86_ops->patch_hypercall(vcpu, instruction);
+	if (!memcmp(instruction, buf, sizeof(instruction)))
+		/*
+		 * The hypercall instruction is the correct one. Retry
+		 * its execution maybe we got here as a result of an
+		 * event other than #UD which has been resolved in the
+		 * mean time.
+		 */
+		return X86EMUL_CONTINUE;
 
-	return emulator_write_emulated(ctxt, rip, instruction, 3,
-		&ctxt->exception);
+	return emulator_write_emulated(ctxt, rip, instruction,
+				       sizeof(instruction), &ctxt->exception);
 }
 
 static int dm_request_for_irq_injection(struct kvm_vcpu *vcpu)