diff mbox

KVM: X86: Fix SMRAM accessing even if VM is shutdown

Message ID 1517984706-47244-1-git-send-email-wanpengli@tencent.com (mailing list archive)
State New, archived
Headers show

Commit Message

Wanpeng Li Feb. 7, 2018, 6:25 a.m. UTC
From: Wanpeng Li <wanpengli@tencent.com>

Reported by syzkaller:

   WARNING: CPU: 6 PID: 2434 at arch/x86/kvm/vmx.c:6660 handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
   CPU: 6 PID: 2434 Comm: repro_test Not tainted 4.15.0+ #4
   RIP: 0010:handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
   Call Trace:
    vmx_handle_exit+0xbd/0xe20 [kvm_intel]
    kvm_arch_vcpu_ioctl_run+0xdaf/0x1d50 [kvm]
    kvm_vcpu_ioctl+0x3e9/0x720 [kvm]
    do_vfs_ioctl+0xa4/0x6a0
    SyS_ioctl+0x79/0x90
    entry_SYSCALL_64_fastpath+0x25/0x9c

The syzkaller creates a former thread to issue KVM_SMI ioctl, and then creates
a latter thread to mmap and operate on the same vCPU, rsm emulation will not be 
executed since there is no something like seabios which implements smi handler 
when running syzkaller directly. This triggers a race condition when running 
the testcase with multiple threads. Sometimes one thread exit w/ SHUTDOWN 
reason, another thread mmaps and operates on the same vCPU, it continues to 
use CS=0x30000, IP=0x8000 to access the address of SMI handler which results 
in the above ept misconfig. This patch fixes it by bailing out immediately if 
the vCPU is marked EXIT_SHUTDOWN reason.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
 arch/x86/kvm/x86.c | 5 +++++
 1 file changed, 5 insertions(+)

Comments

Dmitry Vyukov Feb. 7, 2018, 6:41 a.m. UTC | #1
On Wed, Feb 7, 2018 at 7:25 AM, Wanpeng Li <kernellwp@gmail.com> wrote:
> From: Wanpeng Li <wanpengli@tencent.com>
>
> Reported by syzkaller:
>
>    WARNING: CPU: 6 PID: 2434 at arch/x86/kvm/vmx.c:6660 handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
>    CPU: 6 PID: 2434 Comm: repro_test Not tainted 4.15.0+ #4
>    RIP: 0010:handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
>    Call Trace:
>     vmx_handle_exit+0xbd/0xe20 [kvm_intel]
>     kvm_arch_vcpu_ioctl_run+0xdaf/0x1d50 [kvm]
>     kvm_vcpu_ioctl+0x3e9/0x720 [kvm]
>     do_vfs_ioctl+0xa4/0x6a0
>     SyS_ioctl+0x79/0x90
>     entry_SYSCALL_64_fastpath+0x25/0x9c
>
> The syzkaller creates a former thread to issue KVM_SMI ioctl, and then creates
> a latter thread to mmap and operate on the same vCPU, rsm emulation will not be
> executed since there is no something like seabios which implements smi handler
> when running syzkaller directly. This triggers a race condition when running
> the testcase with multiple threads. Sometimes one thread exit w/ SHUTDOWN
> reason, another thread mmaps and operates on the same vCPU, it continues to
> use CS=0x30000, IP=0x8000 to access the address of SMI handler which results
> in the above ept misconfig. This patch fixes it by bailing out immediately if
> the vCPU is marked EXIT_SHUTDOWN reason.
>
> Reported-by: Dmitry Vyukov <dvyukov@google.com>

This was reported by syzbot:
https://groups.google.com/d/msg/syzkaller-bugs/6GrlY0UcDEk/aMShRKq3AwAJ

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+c1d9517cab094dae65e446c0c5b4de6c40f4dc58@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed.


> Cc: Dmitry Vyukov <dvyukov@google.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
> ---
>  arch/x86/kvm/x86.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 786cd00..445e702 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7458,6 +7458,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
>                 goto out;
>         }
>
> +       if (unlikely(vcpu->run->exit_reason == KVM_EXIT_SHUTDOWN)) {
> +               r = -EINVAL;
> +               goto out;
> +       }
> +
>         if (vcpu->run->kvm_dirty_regs) {
>                 r = sync_regs(vcpu);
>                 if (r != 0)
> --
> 2.7.4
>
Paolo Bonzini Feb. 7, 2018, 2:16 p.m. UTC | #2
On 07/02/2018 07:25, Wanpeng Li wrote:
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 786cd00..445e702 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7458,6 +7458,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
>  		goto out;
>  	}
>  
> +	if (unlikely(vcpu->run->exit_reason == KVM_EXIT_SHUTDOWN)) {
> +		r = -EINVAL;
> +		goto out;
> +	}
> +
>  	if (vcpu->run->kvm_dirty_regs) {
>  		r = sync_regs(vcpu);
>  		if (r != 0)
> 

This most likely breaks triple faults in the usual case where they
should result in resetting the system; the KVM API doesn't say that you
should clear vcpu->run->exit_reason before entering.

What exactly causes the EPT misconfig to reach the WARN?  That is, how
does kvm_mmu_page_fault end up returning a negative errno value?  If I
read the code correctly only tdp_page_fault can do so, so my guess would
be kvm_handle_bad_page:

        if (pfn == KVM_PFN_ERR_RO_FAULT)
                return RET_PF_EMULATE;

        if (pfn == KVM_PFN_ERR_HWPOISON) {
                kvm_send_hwpoison_signal(kvm_vcpu_gfn_to_hva(vcpu, gfn),
current);
                return RET_PF_RETRY;
        }

	/* KVM_PFN_ERR_FAULT */
        return -EFAULT;

Maybe it should return RET_PF_EMULATE, which would cause an emulation
failure and then an exit with KVM_EXIT_INTERNAL_ERROR.

Thanks,

Paolo
Wanpeng Li Feb. 8, 2018, 7:35 a.m. UTC | #3
2018-02-07 22:16 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
> On 07/02/2018 07:25, Wanpeng Li wrote:
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 786cd00..445e702 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -7458,6 +7458,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
>>               goto out;
>>       }
>>
>> +     if (unlikely(vcpu->run->exit_reason == KVM_EXIT_SHUTDOWN)) {
>> +             r = -EINVAL;
>> +             goto out;
>> +     }
>> +
>>       if (vcpu->run->kvm_dirty_regs) {
>>               r = sync_regs(vcpu);
>>               if (r != 0)
>>
>
> This most likely breaks triple faults in the usual case where they
> should result in resetting the system; the KVM API doesn't say that you
> should clear vcpu->run->exit_reason before entering.
>
> What exactly causes the EPT misconfig to reach the WARN?  That is, how
> does kvm_mmu_page_fault end up returning a negative errno value?  If I
> read the code correctly only tdp_page_fault can do so, so my guess would
> be kvm_handle_bad_page:
>
>         if (pfn == KVM_PFN_ERR_RO_FAULT)
>                 return RET_PF_EMULATE;
>
>         if (pfn == KVM_PFN_ERR_HWPOISON) {
>                 kvm_send_hwpoison_signal(kvm_vcpu_gfn_to_hva(vcpu, gfn),
> current);
>                 return RET_PF_RETRY;
>         }
>
>         /* KVM_PFN_ERR_FAULT */
>         return -EFAULT;
>
> Maybe it should return RET_PF_EMULATE, which would cause an emulation
> failure and then an exit with KVM_EXIT_INTERNAL_ERROR.

Agreed, just do it in v2. In addition, I didn't remove the
RET_PFN_ERR_PO_FAULT check since I think otherwise we will miss the
comments above it.

Regards,
Wanpeng Li
Xiao Guangrong Feb. 8, 2018, 8:57 a.m. UTC | #4
On 02/07/2018 10:16 PM, Paolo Bonzini wrote:
> On 07/02/2018 07:25, Wanpeng Li wrote:
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 786cd00..445e702 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -7458,6 +7458,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
>>   		goto out;
>>   	}
>>   
>> +	if (unlikely(vcpu->run->exit_reason == KVM_EXIT_SHUTDOWN)) {
>> +		r = -EINVAL;
>> +		goto out;
>> +	}
>> +
>>   	if (vcpu->run->kvm_dirty_regs) {
>>   		r = sync_regs(vcpu);
>>   		if (r != 0)
>>
> 
> This most likely breaks triple faults in the usual case where they
> should result in resetting the system; the KVM API doesn't say that you
> should clear vcpu->run->exit_reason before entering.
> 
> What exactly causes the EPT misconfig to reach the WARN?  That is, how
> does kvm_mmu_page_fault end up returning a negative errno value?  If I
> read the code correctly only tdp_page_fault can do so, so my guess would
> be kvm_handle_bad_page:
> 
>          if (pfn == KVM_PFN_ERR_RO_FAULT)
>                  return RET_PF_EMULATE;
> 
>          if (pfn == KVM_PFN_ERR_HWPOISON) {
>                  kvm_send_hwpoison_signal(kvm_vcpu_gfn_to_hva(vcpu, gfn),
> current);
>                  return RET_PF_RETRY;
>          }
> 
> 	/* KVM_PFN_ERR_FAULT */
>          return -EFAULT;
> 
> Maybe it should return RET_PF_EMULATE, which would cause an emulation
> failure and then an exit with KVM_EXIT_INTERNAL_ERROR.

So the root cause is that a running vCPU accessing the memory whose memslot
is being updated (met the condition KVM_MEMSLOT_INVALID is set on the its
memslot).

The normal #PF handler breaks KVM_RUN and returns -EFAULT to userspace, we'd
better to make ept-misconfig's handler follow this style as well.

Actually, the WARN_ON in ept-misconfig's handler is unnecessary as
kvm_mmu_page_fault() will warn us if it is the real ept misconfig, so we can
simply return kvm_mmu_page_fault().
Paolo Bonzini Feb. 8, 2018, 10:31 a.m. UTC | #5
On 08/02/2018 09:57, Xiao Guangrong wrote:
>> Maybe it should return RET_PF_EMULATE, which would cause an emulation
>> failure and then an exit with KVM_EXIT_INTERNAL_ERROR.
> 
> So the root cause is that a running vCPU accessing the memory whose memslot
> is being updated (met the condition KVM_MEMSLOT_INVALID is set on the its
> memslot).
> 
> The normal #PF handler breaks KVM_RUN and returns -EFAULT to userspace,
> we'd better to make ept-misconfig's handler follow this style as well.

Why return -EFAULT and not attempt emulation (which will fail)?

Paolo
Xiao Guangrong Feb. 9, 2018, 3:22 a.m. UTC | #6
On 02/08/2018 06:31 PM, Paolo Bonzini wrote:
> On 08/02/2018 09:57, Xiao Guangrong wrote:
>>> Maybe it should return RET_PF_EMULATE, which would cause an emulation
>>> failure and then an exit with KVM_EXIT_INTERNAL_ERROR.
>>
>> So the root cause is that a running vCPU accessing the memory whose memslot
>> is being updated (met the condition KVM_MEMSLOT_INVALID is set on the its
>> memslot).
>>
>> The normal #PF handler breaks KVM_RUN and returns -EFAULT to userspace,
>> we'd better to make ept-misconfig's handler follow this style as well.
> 
> Why return -EFAULT and not attempt emulation (which will fail)?
> 

That is a good question... :)

This case (with KVM_MEMSLOT_INVALID is set) can be easily constructed,
userspace should avoid this case by itself (avoiding vCPU accessing the
memslot which is being updated). If it happens, it's a operation issue
rather than INTERNAL ERROR.

Maybe treat it as MMIO accessing and return to userspace with MMIO_EXIT
is a better solution...
Paolo Bonzini Feb. 9, 2018, 12:42 p.m. UTC | #7
On 09/02/2018 04:22, Xiao Guangrong wrote:
>>
> 
> That is a good question... :)
> 
> This case (with KVM_MEMSLOT_INVALID is set) can be easily constructed,
> userspace should avoid this case by itself (avoiding vCPU accessing the
> memslot which is being updated). If it happens, it's a operation issue
> rather than INTERNAL ERROR.
> 
> Maybe treat it as MMIO accessing and return to userspace with MMIO_EXIT
> is a better solution...

Yeah, that's what emulation would do (except if it's an instruction
fetch, which will cause emulation to fail).  I think it's a bug in the
non-EPT #PF case that we return with -EFAULT.

Paolo
Xiao Guangrong Feb. 11, 2018, 3:20 a.m. UTC | #8
On 02/09/2018 08:42 PM, Paolo Bonzini wrote:
> On 09/02/2018 04:22, Xiao Guangrong wrote:
>>>
>>
>> That is a good question... :)
>>
>> This case (with KVM_MEMSLOT_INVALID is set) can be easily constructed,
>> userspace should avoid this case by itself (avoiding vCPU accessing the
>> memslot which is being updated). If it happens, it's a operation issue
>> rather than INTERNAL ERROR.
>>
>> Maybe treat it as MMIO accessing and return to userspace with MMIO_EXIT
>> is a better solution...
> 
> Yeah, that's what emulation would do (except if it's an instruction
> fetch, which will cause emulation to fail).  I think it's a bug in the
> non-EPT #PF case that we return with -EFAULT.

Wanpeng, could you please do it? :)
Wanpeng Li Feb. 11, 2018, 8:56 a.m. UTC | #9
2018-02-11 11:20 GMT+08:00 Xiao Guangrong <guangrong.xiao@gmail.com>:
>
>
> On 02/09/2018 08:42 PM, Paolo Bonzini wrote:
>>
>> On 09/02/2018 04:22, Xiao Guangrong wrote:
>>>>
>>>>
>>>
>>> That is a good question... :)
>>>
>>> This case (with KVM_MEMSLOT_INVALID is set) can be easily constructed,
>>> userspace should avoid this case by itself (avoiding vCPU accessing the
>>> memslot which is being updated). If it happens, it's a operation issue
>>> rather than INTERNAL ERROR.
>>>
>>> Maybe treat it as MMIO accessing and return to userspace with MMIO_EXIT
>>> is a better solution...
>>
>>
>> Yeah, that's what emulation would do (except if it's an instruction
>> fetch, which will cause emulation to fail).  I think it's a bug in the
>> non-EPT #PF case that we return with -EFAULT.
>
>
> Wanpeng, could you please do it? :)

Thanks for the discussion, I will have a try. :)

Regards,
Wanpeng Li
Wanpeng Li Feb. 11, 2018, 10:57 a.m. UTC | #10
2018-02-09 20:42 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
> On 09/02/2018 04:22, Xiao Guangrong wrote:
>>>
>>
>> That is a good question... :)
>>
>> This case (with KVM_MEMSLOT_INVALID is set) can be easily constructed,
>> userspace should avoid this case by itself (avoiding vCPU accessing the
>> memslot which is being updated). If it happens, it's a operation issue
>> rather than INTERNAL ERROR.
>>
>> Maybe treat it as MMIO accessing and return to userspace with MMIO_EXIT
>> is a better solution...
>
> Yeah, that's what emulation would do (except if it's an instruction
> fetch, which will cause emulation to fail).  I think it's a bug in the

After patch v2, I found that instruction fetch causes emulation to
fail since KVM_MEMSLOT_INVALID.

Regards,
Wanpeng Li
diff mbox

Patch

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 786cd00..445e702 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7458,6 +7458,11 @@  int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 		goto out;
 	}
 
+	if (unlikely(vcpu->run->exit_reason == KVM_EXIT_SHUTDOWN)) {
+		r = -EINVAL;
+		goto out;
+	}
+
 	if (vcpu->run->kvm_dirty_regs) {
 		r = sync_regs(vcpu);
 		if (r != 0)