
KVM: nVMX: Inject #GP when nested_vmx_get_vmptr() fails to read guest memory

Message ID 20200604143158.484651-1-vkuznets@redhat.com (mailing list archive)
State New, archived
Series KVM: nVMX: Inject #GP when nested_vmx_get_vmptr() fails to read guest memory

Commit Message

Vitaly Kuznetsov June 4, 2020, 2:31 p.m. UTC
Syzbot reports the following issue:

WARNING: CPU: 0 PID: 6819 at arch/x86/kvm/x86.c:618 kvm_inject_emulated_page_fault+0x210/0x290 arch/x86/kvm/x86.c:618
...
Call Trace:
...
RIP: 0010:kvm_inject_emulated_page_fault+0x210/0x290 arch/x86/kvm/x86.c:618
...
 nested_vmx_get_vmptr+0x1f9/0x2a0 arch/x86/kvm/vmx/nested.c:4638
 handle_vmon arch/x86/kvm/vmx/nested.c:4767 [inline]
 handle_vmon+0x168/0x3a0 arch/x86/kvm/vmx/nested.c:4728
 vmx_handle_exit+0x29c/0x1260 arch/x86/kvm/vmx/vmx.c:6067

The 'exception' we're trying to inject with kvm_inject_emulated_page_fault() comes from
  nested_vmx_get_vmptr()
   kvm_read_guest_virt()
     kvm_read_guest_virt_helper()
       vcpu->arch.walk_mmu->gva_to_gpa()

but it is only set when the GVA to GPA conversion fails. When the conversion
succeeds but kvm_vcpu_read_guest_page() fails anyway, X86EMUL_IO_NEEDED is
returned and nested_vmx_get_vmptr() calls kvm_inject_emulated_page_fault()
with a zeroed 'exception'. This happens when e.g. the VMXON/VMPTRLD/VMCLEAR
argument is MMIO.

KVM could've handled the request correctly by going to userspace and
performing I/O, but there doesn't seem to be a real use case for such
requests in the first place. Sane guests should not call
VMXON/VMPTRLD/VMCLEAR with anything but normal memory. Just inject #GP to
catch the insane ones.

Reported-by: syzbot+2a7156e11dc199bdbd8a@syzkaller.appspotmail.com
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/nested.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)
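
For reference, the distinction the patch relies on comes from the shape of
kvm_read_guest_virt_helper(): 'exception' is only filled in when the GVA->GPA
translation itself faults, while a failed kvm_vcpu_read_guest_page() (e.g. for
an MMIO address) returns an error without ever touching 'exception'. A
paraphrased sketch of that helper, not the exact upstream code:

static int kvm_read_guest_virt_helper(gva_t addr, void *val, unsigned int bytes,
				      struct kvm_vcpu *vcpu, u32 access,
				      struct x86_exception *exception)
{
	while (bytes) {
		/* Fills 'exception' only if the translation itself faults. */
		gpa_t gpa = vcpu->arch.walk_mmu->gva_to_gpa(vcpu, addr, access,
							    exception);
		unsigned offset = addr & (PAGE_SIZE - 1);
		unsigned toread = min(bytes, (unsigned)PAGE_SIZE - offset);

		if (gpa == UNMAPPED_GVA)
			return X86EMUL_PROPAGATE_FAULT;

		/* Fails for MMIO but never writes to 'exception'. */
		if (kvm_vcpu_read_guest_page(vcpu, gpa >> PAGE_SHIFT, val,
					     offset, toread) < 0)
			return X86EMUL_IO_NEEDED;

		bytes -= toread;
		val += toread;
		addr += toread;
	}
	return X86EMUL_CONTINUE;
}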

Comments

Paolo Bonzini June 4, 2020, 2:40 p.m. UTC | #1
On 04/06/20 16:31, Vitaly Kuznetsov wrote:
> Syzbot reports the following issue:
> 
> WARNING: CPU: 0 PID: 6819 at arch/x86/kvm/x86.c:618 kvm_inject_emulated_page_fault+0x210/0x290 arch/x86/kvm/x86.c:618
> ...
> Call Trace:
> ...
> RIP: 0010:kvm_inject_emulated_page_fault+0x210/0x290 arch/x86/kvm/x86.c:618
> ...
>  nested_vmx_get_vmptr+0x1f9/0x2a0 arch/x86/kvm/vmx/nested.c:4638
>  handle_vmon arch/x86/kvm/vmx/nested.c:4767 [inline]
>  handle_vmon+0x168/0x3a0 arch/x86/kvm/vmx/nested.c:4728
>  vmx_handle_exit+0x29c/0x1260 arch/x86/kvm/vmx/vmx.c:6067
> 
> 'exception' we're trying to inject with kvm_inject_emulated_page_fault() comes from
>   nested_vmx_get_vmptr()
>    kvm_read_guest_virt()
>      kvm_read_guest_virt_helper()
>        vcpu->arch.walk_mmu->gva_to_gpa()
> 
> but it is only set when GVA to GPA conversion fails. In case it doesn't but
> we still fail kvm_vcpu_read_guest_page(), X86EMUL_IO_NEEDED is returned and
> nested_vmx_get_vmptr() calls kvm_inject_emulated_page_fault() with zeroed
> 'exception'. This happen when e.g. VMXON/VMPTRLD/VMCLEAR argument is MMIO.
> 
> KVM could've handled the request correctly by going to userspace and
> performing I/O but there doesn't seem to be a good need for such requests
> in the first place. Sane guests should not call VMXON/VMPTRLD/VMCLEAR with
> anything but normal memory. Just inject #GP to find insane ones.
> 
> Reported-by: syzbot+2a7156e11dc199bdbd8a@syzkaller.appspotmail.com
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/vmx/nested.c | 19 +++++++++++++++++--
>  1 file changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 9c74a732b08d..05d57c3cb1ce 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -4628,14 +4628,29 @@ static int nested_vmx_get_vmptr(struct kvm_vcpu *vcpu, gpa_t *vmpointer)
>  {
>  	gva_t gva;
>  	struct x86_exception e;
> +	int r;
>  
>  	if (get_vmx_mem_address(vcpu, vmx_get_exit_qual(vcpu),
>  				vmcs_read32(VMX_INSTRUCTION_INFO), false,
>  				sizeof(*vmpointer), &gva))
>  		return 1;
>  
> -	if (kvm_read_guest_virt(vcpu, gva, vmpointer, sizeof(*vmpointer), &e)) {
> -		kvm_inject_emulated_page_fault(vcpu, &e);
> +	r = kvm_read_guest_virt(vcpu, gva, vmpointer, sizeof(*vmpointer), &e);
> +	if (r != X86EMUL_CONTINUE) {
> +		if (r == X86EMUL_PROPAGATE_FAULT) {
> +			kvm_inject_emulated_page_fault(vcpu, &e);
> +		} else {
> +			/*
> +			 * X86EMUL_IO_NEEDED is returned when kvm_vcpu_read_guest_page()
> +			 * fails to read guest's memory (e.g. when 'gva' points to MMIO
> +			 * space). While KVM could've handled the request correctly by
> +			 * exiting to userspace and performing I/O, there doesn't seem
> +			 * to be a real use-case behind such requests, just inject #GP
> +			 * for now.
> +			 */
> +			kvm_inject_gp(vcpu, 0);
> +		}
> +
>  		return 1;
>  	}
>  
> 

Hi Vitaly,

looks good but we need to do the same in handle_vmread, handle_vmwrite,
handle_invept and handle_invvpid.  Which probably means adding something
like nested_inject_emulation_fault to commonize the inner "if".

Thanks,

Paolo
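
Such a common helper could look roughly like the sketch below, built from the
inner "if" of the patch above; nested_inject_emulation_fault is used purely as
the placeholder name suggested here:

static void nested_inject_emulation_fault(struct kvm_vcpu *vcpu, int r,
					  struct x86_exception *e)
{
	/* GVA->GPA translation failed: reflect the page fault into the guest. */
	if (r == X86EMUL_PROPAGATE_FAULT)
		kvm_inject_emulated_page_fault(vcpu, e);
	else
		/* e.g. X86EMUL_IO_NEEDED for an MMIO operand: just #GP. */
		kvm_inject_gp(vcpu, 0);
}

with the nested_vmx_get_vmptr() call site, for example, reduced to:

	r = kvm_read_guest_virt(vcpu, gva, vmpointer, sizeof(*vmpointer), &e);
	if (r != X86EMUL_CONTINUE) {
		nested_inject_emulation_fault(vcpu, r, &e);
		return 1;
	}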
Vitaly Kuznetsov June 4, 2020, 2:53 p.m. UTC | #2
Paolo Bonzini <pbonzini@redhat.com> writes:

> On 04/06/20 16:31, Vitaly Kuznetsov wrote:
>> Syzbot reports the following issue:
>> 
>> WARNING: CPU: 0 PID: 6819 at arch/x86/kvm/x86.c:618 kvm_inject_emulated_page_fault+0x210/0x290 arch/x86/kvm/x86.c:618
>> ...
>> Call Trace:
>> ...
>> RIP: 0010:kvm_inject_emulated_page_fault+0x210/0x290 arch/x86/kvm/x86.c:618
>> ...
>>  nested_vmx_get_vmptr+0x1f9/0x2a0 arch/x86/kvm/vmx/nested.c:4638
>>  handle_vmon arch/x86/kvm/vmx/nested.c:4767 [inline]
>>  handle_vmon+0x168/0x3a0 arch/x86/kvm/vmx/nested.c:4728
>>  vmx_handle_exit+0x29c/0x1260 arch/x86/kvm/vmx/vmx.c:6067
>> 
>> 'exception' we're trying to inject with kvm_inject_emulated_page_fault() comes from
>>   nested_vmx_get_vmptr()
>>    kvm_read_guest_virt()
>>      kvm_read_guest_virt_helper()
>>        vcpu->arch.walk_mmu->gva_to_gpa()
>> 
>> but it is only set when GVA to GPA conversion fails. In case it doesn't but
>> we still fail kvm_vcpu_read_guest_page(), X86EMUL_IO_NEEDED is returned and
>> nested_vmx_get_vmptr() calls kvm_inject_emulated_page_fault() with zeroed
>> 'exception'. This happen when e.g. VMXON/VMPTRLD/VMCLEAR argument is MMIO.
>> 
>> KVM could've handled the request correctly by going to userspace and
>> performing I/O but there doesn't seem to be a good need for such requests
>> in the first place. Sane guests should not call VMXON/VMPTRLD/VMCLEAR with
>> anything but normal memory. Just inject #GP to find insane ones.
>> 
>> Reported-by: syzbot+2a7156e11dc199bdbd8a@syzkaller.appspotmail.com
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  arch/x86/kvm/vmx/nested.c | 19 +++++++++++++++++--
>>  1 file changed, 17 insertions(+), 2 deletions(-)
>> 
>> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
>> index 9c74a732b08d..05d57c3cb1ce 100644
>> --- a/arch/x86/kvm/vmx/nested.c
>> +++ b/arch/x86/kvm/vmx/nested.c
>> @@ -4628,14 +4628,29 @@ static int nested_vmx_get_vmptr(struct kvm_vcpu *vcpu, gpa_t *vmpointer)
>>  {
>>  	gva_t gva;
>>  	struct x86_exception e;
>> +	int r;
>>  
>>  	if (get_vmx_mem_address(vcpu, vmx_get_exit_qual(vcpu),
>>  				vmcs_read32(VMX_INSTRUCTION_INFO), false,
>>  				sizeof(*vmpointer), &gva))
>>  		return 1;
>>  
>> -	if (kvm_read_guest_virt(vcpu, gva, vmpointer, sizeof(*vmpointer), &e)) {
>> -		kvm_inject_emulated_page_fault(vcpu, &e);
>> +	r = kvm_read_guest_virt(vcpu, gva, vmpointer, sizeof(*vmpointer), &e);
>> +	if (r != X86EMUL_CONTINUE) {
>> +		if (r == X86EMUL_PROPAGATE_FAULT) {
>> +			kvm_inject_emulated_page_fault(vcpu, &e);
>> +		} else {
>> +			/*
>> +			 * X86EMUL_IO_NEEDED is returned when kvm_vcpu_read_guest_page()
>> +			 * fails to read guest's memory (e.g. when 'gva' points to MMIO
>> +			 * space). While KVM could've handled the request correctly by
>> +			 * exiting to userspace and performing I/O, there doesn't seem
>> +			 * to be a real use-case behind such requests, just inject #GP
>> +			 * for now.
>> +			 */
>> +			kvm_inject_gp(vcpu, 0);
>> +		}
>> +
>>  		return 1;
>>  	}
>>  
>> 
>
> Hi Vitaly,
>
> looks good but we need to do the same in handle_vmread, handle_vmwrite,
> handle_invept and handle_invvpid.  Which probably means adding something
> like nested_inject_emulation_fault to commonize the inner "if".
>

Oh true, I've only looked at nested_vmx_get_vmptr() users to fix the
immediate issue. Will do v2.
Sean Christopherson June 4, 2020, 2:53 p.m. UTC | #3
On Thu, Jun 04, 2020 at 04:40:52PM +0200, Paolo Bonzini wrote:
> On 04/06/20 16:31, Vitaly Kuznetsov wrote:

...

> > KVM could've handled the request correctly by going to userspace and
> > performing I/O but there doesn't seem to be a good need for such requests
> > in the first place. Sane guests should not call VMXON/VMPTRLD/VMCLEAR with
> > anything but normal memory. Just inject #GP to find insane ones.
> > 
> > Reported-by: syzbot+2a7156e11dc199bdbd8a@syzkaller.appspotmail.com
> > Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> > ---
> >  arch/x86/kvm/vmx/nested.c | 19 +++++++++++++++++--
> >  1 file changed, 17 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > index 9c74a732b08d..05d57c3cb1ce 100644
> > --- a/arch/x86/kvm/vmx/nested.c
> > +++ b/arch/x86/kvm/vmx/nested.c
> > @@ -4628,14 +4628,29 @@ static int nested_vmx_get_vmptr(struct kvm_vcpu *vcpu, gpa_t *vmpointer)
> >  {
> >  	gva_t gva;
> >  	struct x86_exception e;
> > +	int r;
> >  
> >  	if (get_vmx_mem_address(vcpu, vmx_get_exit_qual(vcpu),
> >  				vmcs_read32(VMX_INSTRUCTION_INFO), false,
> >  				sizeof(*vmpointer), &gva))
> >  		return 1;
> >  
> > -	if (kvm_read_guest_virt(vcpu, gva, vmpointer, sizeof(*vmpointer), &e)) {
> > -		kvm_inject_emulated_page_fault(vcpu, &e);
> > +	r = kvm_read_guest_virt(vcpu, gva, vmpointer, sizeof(*vmpointer), &e);
> > +	if (r != X86EMUL_CONTINUE) {
> > +		if (r == X86EMUL_PROPAGATE_FAULT) {
> > +			kvm_inject_emulated_page_fault(vcpu, &e);
> > +		} else {
> > +			/*
> > +			 * X86EMUL_IO_NEEDED is returned when kvm_vcpu_read_guest_page()
> > +			 * fails to read guest's memory (e.g. when 'gva' points to MMIO
> > +			 * space). While KVM could've handled the request correctly by
> > +			 * exiting to userspace and performing I/O, there doesn't seem
> > +			 * to be a real use-case behind such requests, just inject #GP
> > +			 * for now.
> > +			 */
> > +			kvm_inject_gp(vcpu, 0);
> > +		}
> > +
> >  		return 1;
> >  	}
> >  
> > 
> 
> Hi Vitaly,
> 
> looks good but we need to do the same in handle_vmread, handle_vmwrite,
> handle_invept and handle_invvpid.  Which probably means adding something
> like nested_inject_emulation_fault to commonize the inner "if".

Can we just kill the guest already instead of throwing more hacks at this
and hoping something sticks?  We already have one in
kvm_write_guest_virt_system...

  commit 541ab2aeb28251bf7135c7961f3a6080eebcc705
  Author: Fuqian Huang <huangfq.daxian@gmail.com>
  Date:   Thu Sep 12 12:18:17 2019 +0800

    KVM: x86: work around leak of uninitialized stack contents

    Emulation of VMPTRST can incorrectly inject a page fault
    when passed an operand that points to an MMIO address.
    The page fault will use uninitialized kernel stack memory
    as the CR2 and error code.

    The right behavior would be to abort the VM with a KVM_EXIT_INTERNAL_ERROR
    exit to userspace; however, it is not an easy fix, so for now just ensure
    that the error code and CR2 are zero.
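
For context, that workaround boils down to clearing the exception struct
before attempting the write, so that a later bogus injection at least cannot
leak kernel stack contents into CR2 and the error code; roughly (a sketch from
memory, not a verbatim quote of that commit):

int kvm_write_guest_virt_system(struct kvm_vcpu *vcpu, gva_t addr, void *val,
				unsigned int bytes, struct x86_exception *exception)
{
	/*
	 * The proper fix would be an emulation-error exit when
	 * X86EMUL_IO_NEEDED is returned, but callers blindly inject a page
	 * fault, so at least make sure CR2 and the error code are zero rather
	 * than uninitialized stack memory.
	 */
	memset(exception, 0, sizeof(*exception));
	return kvm_write_guest_virt_helper(addr, val, bytes, vcpu,
					   PFERR_WRITE_MASK, exception);
}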
Vitaly Kuznetsov June 4, 2020, 3:33 p.m. UTC | #4
Sean Christopherson <sean.j.christopherson@intel.com> writes:

> On Thu, Jun 04, 2020 at 04:40:52PM +0200, Paolo Bonzini wrote:
>> On 04/06/20 16:31, Vitaly Kuznetsov wrote:
>
> ...
>
>> > KVM could've handled the request correctly by going to userspace and
>> > performing I/O but there doesn't seem to be a good need for such requests
>> > in the first place. Sane guests should not call VMXON/VMPTRLD/VMCLEAR with
>> > anything but normal memory. Just inject #GP to find insane ones.
>> > 

...

>> 
>> looks good but we need to do the same in handle_vmread, handle_vmwrite,
>> handle_invept and handle_invvpid.  Which probably means adding something
>> like nested_inject_emulation_fault to commonize the inner "if".
>
> Can we just kill the guest already instead of throwing more hacks at this
> and hoping something sticks?  We already have one in
> kvm_write_guest_virt_system...
>
>   commit 541ab2aeb28251bf7135c7961f3a6080eebcc705
>   Author: Fuqian Huang <huangfq.daxian@gmail.com>
>   Date:   Thu Sep 12 12:18:17 2019 +0800
>
>     KVM: x86: work around leak of uninitialized stack contents
>

Oh I see...

[...]

Let's get back to 'vm_bugged' idea then? 

https://lore.kernel.org/kvm/87muadnn1t.fsf@vitty.brq.redhat.com/
Sean Christopherson June 4, 2020, 4:02 p.m. UTC | #5
On Thu, Jun 04, 2020 at 05:33:25PM +0200, Vitaly Kuznetsov wrote:
> Sean Christopherson <sean.j.christopherson@intel.com> writes:
> 
> > On Thu, Jun 04, 2020 at 04:40:52PM +0200, Paolo Bonzini wrote:
> >> On 04/06/20 16:31, Vitaly Kuznetsov wrote:
> >
> > ...
> >
> >> > KVM could've handled the request correctly by going to userspace and
> >> > performing I/O but there doesn't seem to be a good need for such requests
> >> > in the first place. Sane guests should not call VMXON/VMPTRLD/VMCLEAR with
> >> > anything but normal memory. Just inject #GP to find insane ones.
> >> > 
> 
> ...
> 
> >> 
> >> looks good but we need to do the same in handle_vmread, handle_vmwrite,
> >> handle_invept and handle_invvpid.  Which probably means adding something
> >> like nested_inject_emulation_fault to commonize the inner "if".
> >
> > Can we just kill the guest already instead of throwing more hacks at this
> > and hoping something sticks?  We already have one in
> > kvm_write_guest_virt_system...
> >
> >   commit 541ab2aeb28251bf7135c7961f3a6080eebcc705
> >   Author: Fuqian Huang <huangfq.daxian@gmail.com>
> >   Date:   Thu Sep 12 12:18:17 2019 +0800
> >
> >     KVM: x86: work around leak of uninitialized stack contents
> >
> 
> Oh I see...
> 
> [...]
> 
> Let's get back to 'vm_bugged' idea then? 
> 
> https://lore.kernel.org/kvm/87muadnn1t.fsf@vitty.brq.redhat.com/

Hmm, I don't think we need to go that far.  The 'vm_bugged' idea was more
to handle cases where KVM itself (or hardware) screwed something up and
detects an issue deep in a call stack with no recourse for reporting the
error up the stack.

That isn't the case here.  Unless I'm mistaken, the end result is similar
to this patch, except that KVM would exit to userspace with
KVM_INTERNAL_ERROR_EMULATION instead of injecting a #GP.  E.g.

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 9c74a732b08d..e13d2c0014e2 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4624,6 +4624,20 @@ void nested_vmx_pmu_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
        }
 }

+static int nested_vmx_handle_memory_failure(struct kvm_vcpu *vcpu, int r,
+                                           struct x86_exception *e)
+{
+       if (r == X86EMUL_PROPAGATE_FAULT) {
+               kvm_inject_emulated_page_fault(vcpu, e);
+               return 1;
+       }
+
+       vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
+       vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
+       vcpu->run->internal.ndata = 0;
+       return 0;
+}
+
 static int nested_vmx_get_vmptr(struct kvm_vcpu *vcpu, gpa_t *vmpointer)
 {
        gva_t gva;
@@ -4634,11 +4648,9 @@ static int nested_vmx_get_vmptr(struct kvm_vcpu *vcpu, gpa_t *vmpointer)
                                sizeof(*vmpointer), &gva))
                return 1;

-       if (kvm_read_guest_virt(vcpu, gva, vmpointer, sizeof(*vmpointer), &e)) {
-               kvm_inject_emulated_page_fault(vcpu, &e);
-               return 1;
-       }
-
+       r = kvm_read_guest_virt(vcpu, gva, vmpointer, sizeof(*vmpointer), &e);
+       if (r)
+               return nested_vmx_handle_memory_failure(vcpu, r, &e);
        return 0;
 }



Side topic, I have some preliminary patches for the 'vm_bugged' idea.  I'll
try to whip them into something that can be posted upstream in the next few
weeks.
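
Applied to the other sites Paolo listed (handle_vmread, handle_vmwrite,
handle_invept, handle_invvpid), each call site would collapse to the same
pattern. For example, the operand read in handle_invvpid would become
something like the sketch below, assuming the helper keeps the (vcpu, r, e)
signature sketched above and reusing that function's existing 'operand' and
'gva' locals:

	struct x86_exception e;
	int r;

	/* 'gva' computed via get_vmx_mem_address(), as today */
	r = kvm_read_guest_virt(vcpu, gva, &operand, sizeof(operand), &e);
	if (r != X86EMUL_CONTINUE)
		return nested_vmx_handle_memory_failure(vcpu, r, &e);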
Paolo Bonzini June 4, 2020, 4:11 p.m. UTC | #6
On 04/06/20 18:02, Sean Christopherson wrote:
> On Thu, Jun 04, 2020 at 05:33:25PM +0200, Vitaly Kuznetsov wrote:
>> Sean Christopherson <sean.j.christopherson@intel.com> writes:
>>
>>> On Thu, Jun 04, 2020 at 04:40:52PM +0200, Paolo Bonzini wrote:
>>>> On 04/06/20 16:31, Vitaly Kuznetsov wrote:
>>>
>>> ...
>>>
>>>>> KVM could've handled the request correctly by going to userspace and
>>>>> performing I/O but there doesn't seem to be a good need for such requests
>>>>> in the first place. Sane guests should not call VMXON/VMPTRLD/VMCLEAR with
>>>>> anything but normal memory. Just inject #GP to find insane ones.
>>>>>
>>
>> ...
>>
>>>>
>>>> looks good but we need to do the same in handle_vmread, handle_vmwrite,
>>>> handle_invept and handle_invvpid.  Which probably means adding something
>>>> like nested_inject_emulation_fault to commonize the inner "if".
>>>
>>> Can we just kill the guest already instead of throwing more hacks at this
>>> and hoping something sticks?  We already have one in
>>> kvm_write_guest_virt_system...
>>>
>>>   commit 541ab2aeb28251bf7135c7961f3a6080eebcc705
>>>   Author: Fuqian Huang <huangfq.daxian@gmail.com>
>>>   Date:   Thu Sep 12 12:18:17 2019 +0800
>>>
>>>     KVM: x86: work around leak of uninitialized stack contents
>>>
>>
>> Oh I see...
>>
>> [...]
>>
>> Let's get back to 'vm_bugged' idea then? 
>>
>> https://lore.kernel.org/kvm/87muadnn1t.fsf@vitty.brq.redhat.com/
> 
> Hmm, I don't think we need to go that far.  The 'vm_bugged' idea was more
> to handle cases where KVM itself (or hardware) screwed something up and
> detects an issue deep in a call stack with no recourse for reporting the
> error up the stack.
> 
> That isn't the case here.  Unless I'm mistaken, the end result is simliar
> to this patch, except that KVM would exit to userspace with
> KVM_INTERNAL_ERROR_EMULATION instead of injecting a #GP.

Indeed, all these functions are very high on the call stack and what
Sean has scribbled below would apply to all cases.

Thanks,

Paolo

> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 9c74a732b08d..e13d2c0014e2 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -4624,6 +4624,20 @@ void nested_vmx_pmu_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
>         }
>  }
> 
> +static int nested_vmx_handle_memory_failure(struct kvm_vcpu *vcpu, int ret,
> +                                           struct x86_exception *e)
> +{
> +       if (r == X86EMUL_PROPAGATE_FAULT) {
> +               kvm_inject_emulated_page_fault(vcpu, &e);
> +               return 1;
> +       }
> +
> +       vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
> +       vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
> +       vcpu->run->internal.ndata = 0;
> +       return 0;
> +}
> +
>  static int nested_vmx_get_vmptr(struct kvm_vcpu *vcpu, gpa_t *vmpointer)
>  {
>         gva_t gva;
> @@ -4634,11 +4648,9 @@ static int nested_vmx_get_vmptr(struct kvm_vcpu *vcpu, gpa_t *vmpointer)
>                                 sizeof(*vmpointer), &gva))
>                 return 1;
> 
> -       if (kvm_read_guest_virt(vcpu, gva, vmpointer, sizeof(*vmpointer), &e)) {
> -               kvm_inject_emulated_page_fault(vcpu, &e);
> -               return 1;
> -       }
> -
> +       r kvm_read_guest_virt(vcpu, gva, vmpointer, sizeof(*vmpointer), &e);
> +       if (r)
> +               return nested_vmx_handle_memory_failure(r, &e);
>         return 0;
>  }
> 
> 
> 
> Side topic, I have some preliminary patches for the 'vm_bugged' idea.  I'll
> try to whip them into something that can be posted upstream in the next few
> weeks.
>
Vitaly Kuznetsov June 4, 2020, 4:43 p.m. UTC | #7
Sean Christopherson <sean.j.christopherson@intel.com> writes:

> On Thu, Jun 04, 2020 at 05:33:25PM +0200, Vitaly Kuznetsov wrote:
>> Sean Christopherson <sean.j.christopherson@intel.com> writes:
>> 
>> > On Thu, Jun 04, 2020 at 04:40:52PM +0200, Paolo Bonzini wrote:
>> >> On 04/06/20 16:31, Vitaly Kuznetsov wrote:
>> >
>> > ...
>> >
>> >> > KVM could've handled the request correctly by going to userspace and
>> >> > performing I/O but there doesn't seem to be a good need for such requests
>> >> > in the first place. Sane guests should not call VMXON/VMPTRLD/VMCLEAR with
>> >> > anything but normal memory. Just inject #GP to find insane ones.
>> >> > 
>> 
>> ...
>> 
>> >> 
>> >> looks good but we need to do the same in handle_vmread, handle_vmwrite,
>> >> handle_invept and handle_invvpid.  Which probably means adding something
>> >> like nested_inject_emulation_fault to commonize the inner "if".
>> >
>> > Can we just kill the guest already instead of throwing more hacks at this
>> > and hoping something sticks?  We already have one in
>> > kvm_write_guest_virt_system...
>> >
>> >   commit 541ab2aeb28251bf7135c7961f3a6080eebcc705
>> >   Author: Fuqian Huang <huangfq.daxian@gmail.com>
>> >   Date:   Thu Sep 12 12:18:17 2019 +0800
>> >
>> >     KVM: x86: work around leak of uninitialized stack contents
>> >
>> 
>> Oh I see...
>> 
>> [...]
>> 
>> Let's get back to 'vm_bugged' idea then? 
>> 
>> https://lore.kernel.org/kvm/87muadnn1t.fsf@vitty.brq.redhat.com/
>
> Hmm, I don't think we need to go that far.  The 'vm_bugged' idea was more
> to handle cases where KVM itself (or hardware) screwed something up and
> detects an issue deep in a call stack with no recourse for reporting the
> error up the stack.
>
> That isn't the case here.  Unless I'm mistaken, the end result is simliar
> to this patch, except that KVM would exit to userspace with
> KVM_INTERNAL_ERROR_EMULATION instead of injecting a #GP.  E.g.

I just wanted to resurrect that 'vm_bugged' idea but was waiting for a
good opportunity :-)

The advantage of KVM_EXIT_INTERNAL_ERROR is that we're not trying to
invent behavior which is not in the SDM, and it makes it a bit more likely
that we get a bug report from an angry user.

>
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 9c74a732b08d..e13d2c0014e2 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -4624,6 +4624,20 @@ void nested_vmx_pmu_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
>         }
>  }
>
> +static int nested_vmx_handle_memory_failure(struct kvm_vcpu *vcpu, int ret,
> +                                           struct x86_exception *e)
> +{
> +       if (r == X86EMUL_PROPAGATE_FAULT) {
> +               kvm_inject_emulated_page_fault(vcpu, &e);
> +               return 1;
> +       }
> +
> +       vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
> +       vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
> +       vcpu->run->internal.ndata = 0;
> +       return 0;
> +}
> +
>  static int nested_vmx_get_vmptr(struct kvm_vcpu *vcpu, gpa_t *vmpointer)
>  {
>         gva_t gva;
> @@ -4634,11 +4648,9 @@ static int nested_vmx_get_vmptr(struct kvm_vcpu *vcpu, gpa_t *vmpointer)
>                                 sizeof(*vmpointer), &gva))
>                 return 1;
>
> -       if (kvm_read_guest_virt(vcpu, gva, vmpointer, sizeof(*vmpointer), &e)) {
> -               kvm_inject_emulated_page_fault(vcpu, &e);
> -               return 1;
> -       }
> -
> +       r kvm_read_guest_virt(vcpu, gva, vmpointer, sizeof(*vmpointer), &e);
> +       if (r)
> +               return nested_vmx_handle_memory_failure(r, &e);
>         return 0;
>  }
>

... and the same for handle_vmread, handle_vmwrite, handle_invept and
handle_invvpid as suggested by Paolo. I'll be sending this as v2 with
your Suggested-by: shortly.

>
>
> Side topic, I have some preliminary patches for the 'vm_bugged' idea.  I'll
> try to whip them into something that can be posted upstream in the next few
> weeks.
>

Sounds great!
Jim Mattson June 4, 2020, 6:10 p.m. UTC | #8
On Thu, Jun 4, 2020 at 9:43 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
> Sean Christopherson <sean.j.christopherson@intel.com> writes:
>
> > On Thu, Jun 04, 2020 at 05:33:25PM +0200, Vitaly Kuznetsov wrote:
> >> Sean Christopherson <sean.j.christopherson@intel.com> writes:
> >>
> >> > On Thu, Jun 04, 2020 at 04:40:52PM +0200, Paolo Bonzini wrote:
> >> >> On 04/06/20 16:31, Vitaly Kuznetsov wrote:
> >> >
> >> > ...
> >> >
> >> >> > KVM could've handled the request correctly by going to userspace and
> >> >> > performing I/O but there doesn't seem to be a good need for such requests
> >> >> > in the first place. Sane guests should not call VMXON/VMPTRLD/VMCLEAR with
> >> >> > anything but normal memory. Just inject #GP to find insane ones.
> >> >> >
> >>
> >> ...
> >>
> >> >>
> >> >> looks good but we need to do the same in handle_vmread, handle_vmwrite,
> >> >> handle_invept and handle_invvpid.  Which probably means adding something
> >> >> like nested_inject_emulation_fault to commonize the inner "if".
> >> >
> >> > Can we just kill the guest already instead of throwing more hacks at this
> >> > and hoping something sticks?  We already have one in
> >> > kvm_write_guest_virt_system...
> >> >
> >> >   commit 541ab2aeb28251bf7135c7961f3a6080eebcc705
> >> >   Author: Fuqian Huang <huangfq.daxian@gmail.com>
> >> >   Date:   Thu Sep 12 12:18:17 2019 +0800
> >> >
> >> >     KVM: x86: work around leak of uninitialized stack contents
> >> >
> >>
> >> Oh I see...
> >>
> >> [...]
> >>
> >> Let's get back to 'vm_bugged' idea then?
> >>
> >> https://lore.kernel.org/kvm/87muadnn1t.fsf@vitty.brq.redhat.com/
> >
> > Hmm, I don't think we need to go that far.  The 'vm_bugged' idea was more
> > to handle cases where KVM itself (or hardware) screwed something up and
> > detects an issue deep in a call stack with no recourse for reporting the
> > error up the stack.
> >
> > That isn't the case here.  Unless I'm mistaken, the end result is simliar
> > to this patch, except that KVM would exit to userspace with
> > KVM_INTERNAL_ERROR_EMULATION instead of injecting a #GP.  E.g.
>
> I just wanted to resurrect that 'vm_bugged' idea but was waiting for a
> good opportunity :-)
>
> The advantage of KVM_EXIT_INTERNAL_ERROR is that we're not trying to
> invent some behavior which is not in SDM and making it a bit more likely
> that we get a bug report from an angry user.

If KVM can't handle the emulation, KVM_EXIT_INTERNAL_ERROR is far
better than cooking up fictional faults to deliver to the guest.

> >
> > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > index 9c74a732b08d..e13d2c0014e2 100644
> > --- a/arch/x86/kvm/vmx/nested.c
> > +++ b/arch/x86/kvm/vmx/nested.c
> > @@ -4624,6 +4624,20 @@ void nested_vmx_pmu_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
> >         }
> >  }
> >
> > +static int nested_vmx_handle_memory_failure(struct kvm_vcpu *vcpu, int ret,
> > +                                           struct x86_exception *e)
> > +{
> > +       if (r == X86EMUL_PROPAGATE_FAULT) {
> > +               kvm_inject_emulated_page_fault(vcpu, &e);
> > +               return 1;
> > +       }
> > +
> > +       vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
> > +       vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
> > +       vcpu->run->internal.ndata = 0;
> > +       return 0;
> > +}
> > +
> >  static int nested_vmx_get_vmptr(struct kvm_vcpu *vcpu, gpa_t *vmpointer)
> >  {
> >         gva_t gva;
> > @@ -4634,11 +4648,9 @@ static int nested_vmx_get_vmptr(struct kvm_vcpu *vcpu, gpa_t *vmpointer)
> >                                 sizeof(*vmpointer), &gva))
> >                 return 1;
> >
> > -       if (kvm_read_guest_virt(vcpu, gva, vmpointer, sizeof(*vmpointer), &e)) {
> > -               kvm_inject_emulated_page_fault(vcpu, &e);
> > -               return 1;
> > -       }
> > -
> > +       r kvm_read_guest_virt(vcpu, gva, vmpointer, sizeof(*vmpointer), &e);
> > +       if (r)
> > +               return nested_vmx_handle_memory_failure(r, &e);
> >         return 0;
> >  }
> >
>
> ... and the same for handle_vmread, handle_vmwrite, handle_invept and
> handle_invvpid as suggested by Paolo. I'll be sending this as v2 with
> your Suggested-by: shortly.
>
> >
> >
> > Side topic, I have some preliminary patches for the 'vm_bugged' idea.  I'll
> > try to whip them into something that can be posted upstream in the next few
> > weeks.
> >
>
> Sounds great!
>
> --
> Vitaly
>

Patch

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 9c74a732b08d..05d57c3cb1ce 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4628,14 +4628,29 @@  static int nested_vmx_get_vmptr(struct kvm_vcpu *vcpu, gpa_t *vmpointer)
 {
 	gva_t gva;
 	struct x86_exception e;
+	int r;
 
 	if (get_vmx_mem_address(vcpu, vmx_get_exit_qual(vcpu),
 				vmcs_read32(VMX_INSTRUCTION_INFO), false,
 				sizeof(*vmpointer), &gva))
 		return 1;
 
-	if (kvm_read_guest_virt(vcpu, gva, vmpointer, sizeof(*vmpointer), &e)) {
-		kvm_inject_emulated_page_fault(vcpu, &e);
+	r = kvm_read_guest_virt(vcpu, gva, vmpointer, sizeof(*vmpointer), &e);
+	if (r != X86EMUL_CONTINUE) {
+		if (r == X86EMUL_PROPAGATE_FAULT) {
+			kvm_inject_emulated_page_fault(vcpu, &e);
+		} else {
+			/*
+			 * X86EMUL_IO_NEEDED is returned when kvm_vcpu_read_guest_page()
+			 * fails to read guest's memory (e.g. when 'gva' points to MMIO
+			 * space). While KVM could've handled the request correctly by
+			 * exiting to userspace and performing I/O, there doesn't seem
+			 * to be a real use-case behind such requests, just inject #GP
+			 * for now.
+			 */
+			kvm_inject_gp(vcpu, 0);
+		}
+
 		return 1;
 	}