diff mbox series

KVM: x86: work around leak of uninitialized stack contents

Message ID 20190912041817.23984-1-huangfq.daxian@gmail.com (mailing list archive)
State New, archived
Headers show
Series KVM: x86: work around leak of uninitialized stack contents | expand

Commit Message

Fuqian Huang Sept. 12, 2019, 4:18 a.m. UTC
Emulation of VMPTRST can incorrectly inject a page fault
when passed an operand that points to an MMIO address.
The page fault will use uninitialized kernel stack memory
as the CR2 and error code.

The right behavior would be to abort the VM with a KVM_EXIT_INTERNAL_ERROR
exit to userspace; however, it is not an easy fix, so for now just ensure
that the error code and CR2 are zero.

Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
---
 arch/x86/kvm/x86.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Vitaly Kuznetsov Sept. 12, 2019, 8:51 a.m. UTC | #1
Fuqian Huang <huangfq.daxian@gmail.com> writes:

> Emulation of VMPTRST can incorrectly inject a page fault
> when passed an operand that points to an MMIO address.
> The page fault will use uninitialized kernel stack memory
> as the CR2 and error code.
>
> The right behavior would be to abort the VM with a KVM_EXIT_INTERNAL_ERROR
> exit to userspace;

Hm, why so? KVM_EXIT_INTERNAL_ERROR is basically an error in KVM, this
is not a proper reaction to a userspace-induced condition (or ever).

I also looked at VMPTRST's description in Intel's manual and I can't
find and explicit limitation like "this must be normal memory". We're
just supposed to inject #PF "If a page fault occurs in accessing the
memory destination operand."

In case it seems to be too cumbersome to handle VMPTRST to MMIO and we
think that nobody should be doing that I'd rather prefer injecting #GP.

Please tell me what I'm missing :-)

>  however, it is not an easy fix, so for now just ensure
> that the error code and CR2 are zero.
>
> Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
> ---
>  arch/x86/kvm/x86.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 290c3c3efb87..7f442d710858 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5312,6 +5312,7 @@ int kvm_write_guest_virt_system(struct kvm_vcpu *vcpu, gva_t addr, void *val,
>  	/* kvm_write_guest_virt_system can pull in tons of pages. */
>  	vcpu->arch.l1tf_flush_l1d = true;
>  
> +	memset(exception, 0, sizeof(*exception));
>  	return kvm_write_guest_virt_helper(addr, val, bytes, vcpu,
>  					   PFERR_WRITE_MASK, exception);
>  }
Fuqian Huang Sept. 12, 2019, 8:56 a.m. UTC | #2
Vitaly Kuznetsov <vkuznets@redhat.com> 於 2019年9月12日週四 下午4:51寫道:
>
> Fuqian Huang <huangfq.daxian@gmail.com> writes:
>
> > Emulation of VMPTRST can incorrectly inject a page fault
> > when passed an operand that points to an MMIO address.
> > The page fault will use uninitialized kernel stack memory
> > as the CR2 and error code.
> >
> > The right behavior would be to abort the VM with a KVM_EXIT_INTERNAL_ERROR
> > exit to userspace;
>
> Hm, why so? KVM_EXIT_INTERNAL_ERROR is basically an error in KVM, this
> is not a proper reaction to a userspace-induced condition (or ever).
>
> I also looked at VMPTRST's description in Intel's manual and I can't
> find and explicit limitation like "this must be normal memory". We're
> just supposed to inject #PF "If a page fault occurs in accessing the
> memory destination operand."
>
> In case it seems to be too cumbersome to handle VMPTRST to MMIO and we
> think that nobody should be doing that I'd rather prefer injecting #GP.
>
> Please tell me what I'm missing :-)

I found it during the code review, and it looks like the problem the
commit 353c0956a618 ("KVM: x86: work around leak of uninitialized
stack contents (CVE-2019-7222)")
mentions. So I fixed it in a similar way.

>
> >  however, it is not an easy fix, so for now just ensure
> > that the error code and CR2 are zero.
> >
> > Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
> > ---
> >  arch/x86/kvm/x86.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 290c3c3efb87..7f442d710858 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -5312,6 +5312,7 @@ int kvm_write_guest_virt_system(struct kvm_vcpu *vcpu, gva_t addr, void *val,
> >       /* kvm_write_guest_virt_system can pull in tons of pages. */
> >       vcpu->arch.l1tf_flush_l1d = true;
> >
> > +     memset(exception, 0, sizeof(*exception));
> >       return kvm_write_guest_virt_helper(addr, val, bytes, vcpu,
> >                                          PFERR_WRITE_MASK, exception);
> >  }
>
> --
> Vitaly
Vitaly Kuznetsov Sept. 12, 2019, 10:53 a.m. UTC | #3
Fuqian Huang <huangfq.daxian@gmail.com> writes:

> Vitaly Kuznetsov <vkuznets@redhat.com> 於 2019年9月12日週四 下午4:51寫道:
>>
>> Fuqian Huang <huangfq.daxian@gmail.com> writes:
>>
>> > Emulation of VMPTRST can incorrectly inject a page fault
>> > when passed an operand that points to an MMIO address.
>> > The page fault will use uninitialized kernel stack memory
>> > as the CR2 and error code.
>> >
>> > The right behavior would be to abort the VM with a KVM_EXIT_INTERNAL_ERROR
>> > exit to userspace;
>>
>> Hm, why so? KVM_EXIT_INTERNAL_ERROR is basically an error in KVM, this
>> is not a proper reaction to a userspace-induced condition (or ever).
>>
>> I also looked at VMPTRST's description in Intel's manual and I can't
>> find and explicit limitation like "this must be normal memory". We're
>> just supposed to inject #PF "If a page fault occurs in accessing the
>> memory destination operand."
>>
>> In case it seems to be too cumbersome to handle VMPTRST to MMIO and we
>> think that nobody should be doing that I'd rather prefer injecting #GP.
>>
>> Please tell me what I'm missing :-)
>
> I found it during the code review, and it looks like the problem the
> commit 353c0956a618 ("KVM: x86: work around leak of uninitialized
> stack contents (CVE-2019-7222)")
> mentions. So I fixed it in a similar way.
>

Oh, yes, I'm not against the fix at all, I was just wondering about why
you think we need to kill the guest in this case.
Fuqian Huang Sept. 12, 2019, 12:02 p.m. UTC | #4
Vitaly Kuznetsov <vkuznets@redhat.com> 於 2019年9月12日週四 下午6:53寫道:
>
> Fuqian Huang <huangfq.daxian@gmail.com> writes:
>
> > Vitaly Kuznetsov <vkuznets@redhat.com> 於 2019年9月12日週四 下午4:51寫道:
> >>
> >> Fuqian Huang <huangfq.daxian@gmail.com> writes:
> >>
> >> > Emulation of VMPTRST can incorrectly inject a page fault
> >> > when passed an operand that points to an MMIO address.
> >> > The page fault will use uninitialized kernel stack memory
> >> > as the CR2 and error code.
> >> >
> >> > The right behavior would be to abort the VM with a KVM_EXIT_INTERNAL_ERROR
> >> > exit to userspace;
> >>
> >> Hm, why so? KVM_EXIT_INTERNAL_ERROR is basically an error in KVM, this
> >> is not a proper reaction to a userspace-induced condition (or ever).
> >>
> >> I also looked at VMPTRST's description in Intel's manual and I can't
> >> find and explicit limitation like "this must be normal memory". We're
> >> just supposed to inject #PF "If a page fault occurs in accessing the
> >> memory destination operand."
> >>
> >> In case it seems to be too cumbersome to handle VMPTRST to MMIO and we
> >> think that nobody should be doing that I'd rather prefer injecting #GP.
> >>
> >> Please tell me what I'm missing :-)
> >
> > I found it during the code review, and it looks like the problem the
> > commit 353c0956a618 ("KVM: x86: work around leak of uninitialized
> > stack contents (CVE-2019-7222)")
> > mentions. So I fixed it in a similar way.
> >
>
> Oh, yes, I'm not against the fix at all, I was just wondering about why
> you think we need to kill the guest in this case.

I don't know what is the proper way to handle this case, I just initialize the
memory to avoid information leakage.
(Actually, I am not an expert on KVM's code)
I will be appreciated if the bug is fixed. :)

>
> --
> Vitaly
Jim Mattson Sept. 12, 2019, 4:20 p.m. UTC | #5
On Thu, Sep 12, 2019 at 1:51 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
> Fuqian Huang <huangfq.daxian@gmail.com> writes:
>
> > Emulation of VMPTRST can incorrectly inject a page fault
> > when passed an operand that points to an MMIO address.
> > The page fault will use uninitialized kernel stack memory
> > as the CR2 and error code.
> >
> > The right behavior would be to abort the VM with a KVM_EXIT_INTERNAL_ERROR
> > exit to userspace;
>
> Hm, why so? KVM_EXIT_INTERNAL_ERROR is basically an error in KVM, this
> is not a proper reaction to a userspace-induced condition (or ever).

This *is* an error in KVM. KVM should properly emulate the quadword
store to the emulated device. Doing anything else is just wrong.

KVM_INTERNAL_ERROR is basically a cop-out for things that are hard.

> I also looked at VMPTRST's description in Intel's manual and I can't
> find and explicit limitation like "this must be normal memory". We're
> just supposed to inject #PF "If a page fault occurs in accessing the
> memory destination operand."
>
> In case it seems to be too cumbersome to handle VMPTRST to MMIO and we
> think that nobody should be doing that I'd rather prefer injecting #GP.

That is not the architected behavior at all. Now you're just making things up!

> Please tell me what I'm missing :-)
>
> >  however, it is not an easy fix, so for now just ensure
> > that the error code and CR2 are zero.
> >
> > Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
> > ---
> >  arch/x86/kvm/x86.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 290c3c3efb87..7f442d710858 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -5312,6 +5312,7 @@ int kvm_write_guest_virt_system(struct kvm_vcpu *vcpu, gva_t addr, void *val,
> >       /* kvm_write_guest_virt_system can pull in tons of pages. */
> >       vcpu->arch.l1tf_flush_l1d = true;
> >
> > +     memset(exception, 0, sizeof(*exception));
> >       return kvm_write_guest_virt_helper(addr, val, bytes, vcpu,
> >                                          PFERR_WRITE_MASK, exception);
> >  }
>
> --
> Vitaly
Vitaly Kuznetsov Sept. 12, 2019, 4:44 p.m. UTC | #6
Jim Mattson <jmattson@google.com> writes:

> On Thu, Sep 12, 2019 at 1:51 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>>
>> Fuqian Huang <huangfq.daxian@gmail.com> writes:
>>
>> > Emulation of VMPTRST can incorrectly inject a page fault
>> > when passed an operand that points to an MMIO address.
>> > The page fault will use uninitialized kernel stack memory
>> > as the CR2 and error code.
>> >
>> > The right behavior would be to abort the VM with a KVM_EXIT_INTERNAL_ERROR
>> > exit to userspace;
>>
>> Hm, why so? KVM_EXIT_INTERNAL_ERROR is basically an error in KVM, this
>> is not a proper reaction to a userspace-induced condition (or ever).
>
> This *is* an error in KVM. KVM should properly emulate the quadword
> store to the emulated device. Doing anything else is just wrong.
>
> KVM_INTERNAL_ERROR is basically a cop-out for things that are hard.

Yes, I way arguing with "the right behavior would be" in relation to
KVM_INTERNAL_ERROR.

>
>> I also looked at VMPTRST's description in Intel's manual and I can't
>> find and explicit limitation like "this must be normal memory". We're
>> just supposed to inject #PF "If a page fault occurs in accessing the
>> memory destination operand."
>>
>> In case it seems to be too cumbersome to handle VMPTRST to MMIO and we
>> think that nobody should be doing that I'd rather prefer injecting #GP.
>
> That is not the architected behavior at all. Now you're just making
> things up!

True and I'm not against KVM_INTERNAL_ERROR as an iterim solution if it
comes with a comment explaining why we're 'admitting defeat' here.
Jim Mattson Sept. 12, 2019, 9:20 p.m. UTC | #7
On Wed, Sep 11, 2019 at 9:18 PM Fuqian Huang <huangfq.daxian@gmail.com> wrote:
>
> Emulation of VMPTRST can incorrectly inject a page fault
> when passed an operand that points to an MMIO address.
> The page fault will use uninitialized kernel stack memory
> as the CR2 and error code.
>
> The right behavior would be to abort the VM with a KVM_EXIT_INTERNAL_ERROR
> exit to userspace; however, it is not an easy fix, so for now just ensure
> that the error code and CR2 are zero.
>
> Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
> ---
>  arch/x86/kvm/x86.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 290c3c3efb87..7f442d710858 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5312,6 +5312,7 @@ int kvm_write_guest_virt_system(struct kvm_vcpu *vcpu, gva_t addr, void *val,
>         /* kvm_write_guest_virt_system can pull in tons of pages. */
>         vcpu->arch.l1tf_flush_l1d = true;
>
> +       memset(exception, 0, sizeof(*exception));
>         return kvm_write_guest_virt_helper(addr, val, bytes, vcpu,
>                                            PFERR_WRITE_MASK, exception);
>  }
> --
> 2.11.0
>
Perhaps you could also add a comment like the one Paolo added when he
made the same change in kvm_read_guest_virt?
See commit 353c0956a618 ("KVM: x86: work around leak of uninitialized
stack contents (CVE-2019-7222)").
Sean Christopherson Sept. 12, 2019, 11:52 p.m. UTC | #8
On Thu, Sep 12, 2019 at 02:20:09PM -0700, Jim Mattson wrote:
> On Wed, Sep 11, 2019 at 9:18 PM Fuqian Huang <huangfq.daxian@gmail.com> wrote:
> >
> > Emulation of VMPTRST can incorrectly inject a page fault
> > when passed an operand that points to an MMIO address.
> > The page fault will use uninitialized kernel stack memory
> > as the CR2 and error code.
> >
> > The right behavior would be to abort the VM with a KVM_EXIT_INTERNAL_ERROR
> > exit to userspace; however, it is not an easy fix, so for now just ensure
> > that the error code and CR2 are zero.
> >
> > Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
> > ---
> >  arch/x86/kvm/x86.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 290c3c3efb87..7f442d710858 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -5312,6 +5312,7 @@ int kvm_write_guest_virt_system(struct kvm_vcpu *vcpu, gva_t addr, void *val,
> >         /* kvm_write_guest_virt_system can pull in tons of pages. */
> >         vcpu->arch.l1tf_flush_l1d = true;
> >
> > +       memset(exception, 0, sizeof(*exception));
> >         return kvm_write_guest_virt_helper(addr, val, bytes, vcpu,
> >                                            PFERR_WRITE_MASK, exception);
> >  }
> > --
> > 2.11.0
> >
> Perhaps you could also add a comment like the one Paolo added when he
> made the same change in kvm_read_guest_virt?
> See commit 353c0956a618 ("KVM: x86: work around leak of uninitialized
> stack contents (CVE-2019-7222)").

I have a better hack-a-fix, we can handle the unexpected MMIO using master
abort semantics, i.e. reads return all ones, writes are dropped.  It's not
100% correct as KVM won't handle the case where the address is legit MMIO,
but it's at least sometimes correct and thus better than a #PF.

Patch and a unit test incoming...
Paolo Bonzini Sept. 13, 2019, 9:07 a.m. UTC | #9
On 13/09/19 01:52, Sean Christopherson wrote:
>>>
>> Perhaps you could also add a comment like the one Paolo added when he
>> made the same change in kvm_read_guest_virt?
>> See commit 353c0956a618 ("KVM: x86: work around leak of uninitialized
>> stack contents (CVE-2019-7222)").
> I have a better hack-a-fix, we can handle the unexpected MMIO using master
> abort semantics, i.e. reads return all ones, writes are dropped.  It's not
> 100% correct as KVM won't handle the case where the address is legit MMIO,
> but it's at least sometimes correct and thus better than a #PF.

That's still hacky though.  I agree with Jim that
KVM_EXIT_INTERNAL_ERROR is basically "math is hard, let's go shopping"
but it's better than making up our own behavior (of either the chipset
or the processor).

I'll add the comment and commit Fuqiang's patch.

Paolo
diff mbox series

Patch

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 290c3c3efb87..7f442d710858 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5312,6 +5312,7 @@  int kvm_write_guest_virt_system(struct kvm_vcpu *vcpu, gva_t addr, void *val,
 	/* kvm_write_guest_virt_system can pull in tons of pages. */
 	vcpu->arch.l1tf_flush_l1d = true;
 
+	memset(exception, 0, sizeof(*exception));
 	return kvm_write_guest_virt_helper(addr, val, bytes, vcpu,
 					   PFERR_WRITE_MASK, exception);
 }