diff mbox

KVM: nVMX: Replace kvm_set_cr0 with vmx_set_cr0 in load_vmcs12_host_state

Message ID 5129361A.7090608@web.de (mailing list archive)
State New, archived
Headers show

Commit Message

Jan Kiszka Feb. 23, 2013, 9:35 p.m. UTC
From: Jan Kiszka <jan.kiszka@siemens.com>

Likely a typo, but a fatal one as kvm_set_cr0 performs checks on the
state transition that may prevent loading L1's cr0.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 arch/x86/kvm/vmx.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

Comments

Nadav Har'El Feb. 23, 2013, 9:45 p.m. UTC | #1
On Sat, Feb 23, 2013, Jan Kiszka wrote about "[PATCH] KVM: nVMX: Replace kvm_set_cr0 with vmx_set_cr0 in load_vmcs12_host_state":
> -	kvm_set_cr0(vcpu, vmcs12->host_cr0);
> +	vmx_set_cr0(vcpu, vmcs12->host_cr0);

I don't remember now why I did this (and I'm not looking at the code),
but this you'll need to really test carefully, including
shadow-on-shadow mode (ept=0 in L0), to verify you're not missing any
important side-effect of kvm_set_cr0.

Also, if I remember correctly, during nVMX's review, Avi Kivity asked
in several places that when I called vmx_set_cr0, I should instead call
kvm_set_cr0(), because it does some extra stuff and does some extra
checks. Hmm, see, see this:
	http://markmail.org/message/hhidqyhbo2mrgxxc

where Avi asked for the reverse patch you're attempting now.
Jan Kiszka Feb. 23, 2013, 9:57 p.m. UTC | #2
On 2013-02-23 22:45, Nadav Har'El wrote:
> On Sat, Feb 23, 2013, Jan Kiszka wrote about "[PATCH] KVM: nVMX: Replace kvm_set_cr0 with vmx_set_cr0 in load_vmcs12_host_state":
>> -	kvm_set_cr0(vcpu, vmcs12->host_cr0);
>> +	vmx_set_cr0(vcpu, vmcs12->host_cr0);
> 
> I don't remember now why I did this (and I'm not looking at the code),
> but this you'll need to really test carefully, including
> shadow-on-shadow mode (ept=0 in L0), to verify you're not missing any
> important side-effect of kvm_set_cr0.
> 
> Also, if I remember correctly, during nVMX's review, Avi Kivity asked
> in several places that when I called vmx_set_cr0, I should instead call
> kvm_set_cr0(), because it does some extra stuff and does some extra
> checks. Hmm, see, see this:
> 	http://markmail.org/message/hhidqyhbo2mrgxxc
> 
> where Avi asked for the reverse patch you're attempting now.

At least, kvm_set_cr0 can't be used as it assumes an otherwise
consistent guest state and an explicitly initiated transition - which is
naturally not the case while emulating a vmexit.

However, there are some side effects that need a closer look, likely
also in other places where vmx_set_cr0 is used directly. E.g.
load_pdptrs. The lack of calling them on host->guest switch may explain
the PAE issues you once saw with your nEPT implementation.

OK, more work...

Jan
Jan Kiszka Feb. 23, 2013, 10:21 p.m. UTC | #3
On 2013-02-23 22:57, Jan Kiszka wrote:
> On 2013-02-23 22:45, Nadav Har'El wrote:
>> On Sat, Feb 23, 2013, Jan Kiszka wrote about "[PATCH] KVM: nVMX: Replace kvm_set_cr0 with vmx_set_cr0 in load_vmcs12_host_state":
>>> -	kvm_set_cr0(vcpu, vmcs12->host_cr0);
>>> +	vmx_set_cr0(vcpu, vmcs12->host_cr0);
>>
>> I don't remember now why I did this (and I'm not looking at the code),
>> but this you'll need to really test carefully, including
>> shadow-on-shadow mode (ept=0 in L0), to verify you're not missing any
>> important side-effect of kvm_set_cr0.
>>
>> Also, if I remember correctly, during nVMX's review, Avi Kivity asked
>> in several places that when I called vmx_set_cr0, I should instead call
>> kvm_set_cr0(), because it does some extra stuff and does some extra
>> checks. Hmm, see, see this:
>> 	http://markmail.org/message/hhidqyhbo2mrgxxc
>>
>> where Avi asked for the reverse patch you're attempting now.
> 
> At least, kvm_set_cr0 can't be used as it assumes an otherwise
> consistent guest state and an explicitly initiated transition - which is
> naturally not the case while emulating a vmexit.
> 
> However, there are some side effects that need a closer look, likely
> also in other places where vmx_set_cr0 is used directly. E.g.
> load_pdptrs. The lack of calling them on host->guest switch may explain
> the PAE issues you once saw with your nEPT implementation.
> 
> OK, more work...

Checked again, and the patch should be fine: load_pdptrs is executed by
kvm_set_cr3 later in this functions (same for prepare_vmcs02, so no
problem for PAE here), kvm_mmu_reset_context is even run explicitly and
clearing the async-pf queue makes no sense when leaving the guest (the
guest can't use it unless L1 is broken and passes this through).
However, we should clear the queue when leaving the host - separate patch.

Jan
Jan Kiszka Feb. 24, 2013, 8:34 a.m. UTC | #4
On 2013-02-23 23:21, Jan Kiszka wrote:
> On 2013-02-23 22:57, Jan Kiszka wrote:
>> On 2013-02-23 22:45, Nadav Har'El wrote:
>>> On Sat, Feb 23, 2013, Jan Kiszka wrote about "[PATCH] KVM: nVMX: Replace kvm_set_cr0 with vmx_set_cr0 in load_vmcs12_host_state":
>>>> -	kvm_set_cr0(vcpu, vmcs12->host_cr0);
>>>> +	vmx_set_cr0(vcpu, vmcs12->host_cr0);
>>>
>>> I don't remember now why I did this (and I'm not looking at the code),
>>> but this you'll need to really test carefully, including
>>> shadow-on-shadow mode (ept=0 in L0), to verify you're not missing any
>>> important side-effect of kvm_set_cr0.
>>>
>>> Also, if I remember correctly, during nVMX's review, Avi Kivity asked
>>> in several places that when I called vmx_set_cr0, I should instead call
>>> kvm_set_cr0(), because it does some extra stuff and does some extra
>>> checks. Hmm, see, see this:
>>> 	http://markmail.org/message/hhidqyhbo2mrgxxc
>>>
>>> where Avi asked for the reverse patch you're attempting now.
>>
>> At least, kvm_set_cr0 can't be used as it assumes an otherwise
>> consistent guest state and an explicitly initiated transition - which is
>> naturally not the case while emulating a vmexit.
>>
>> However, there are some side effects that need a closer look, likely
>> also in other places where vmx_set_cr0 is used directly. E.g.
>> load_pdptrs. The lack of calling them on host->guest switch may explain
>> the PAE issues you once saw with your nEPT implementation.
>>
>> OK, more work...
> 
> Checked again, and the patch should be fine: load_pdptrs is executed by
> kvm_set_cr3 later in this functions (same for prepare_vmcs02, so no
> problem for PAE here), kvm_mmu_reset_context is even run explicitly and
> clearing the async-pf queue makes no sense when leaving the guest (the
> guest can't use it unless L1 is broken and passes this through).
> However, we should clear the queue when leaving the host - separate patch.

Thought about the async-pf thing again: It rather looks to me like the
host does not have to worry about it at all. It's L1's job to either
disable async-pf before entering L2 or trap PFs and handle paravirtual
reasons before they hit L2. I don't see that KVM already does this, but
I don't think that matters at this point.

Jan
Avi Kivity Feb. 24, 2013, 8:56 a.m. UTC | #5
On Sat, Feb 23, 2013 at 11:57 PM, Jan Kiszka <jan.kiszka@web.de> wrote:
> On 2013-02-23 22:45, Nadav Har'El wrote:
>> On Sat, Feb 23, 2013, Jan Kiszka wrote about "[PATCH] KVM: nVMX: Replace kvm_set_cr0 with vmx_set_cr0 in load_vmcs12_host_state":
>>> -    kvm_set_cr0(vcpu, vmcs12->host_cr0);
>>> +    vmx_set_cr0(vcpu, vmcs12->host_cr0);
>>
>> I don't remember now why I did this (and I'm not looking at the code),
>> but this you'll need to really test carefully, including
>> shadow-on-shadow mode (ept=0 in L0), to verify you're not missing any
>> important side-effect of kvm_set_cr0.
>>
>> Also, if I remember correctly, during nVMX's review, Avi Kivity asked
>> in several places that when I called vmx_set_cr0, I should instead call
>> kvm_set_cr0(), because it does some extra stuff and does some extra
>> checks. Hmm, see, see this:
>>       http://markmail.org/message/hhidqyhbo2mrgxxc
>>
>> where Avi asked for the reverse patch you're attempting now.
>
> At least, kvm_set_cr0 can't be used as it assumes an otherwise
> consistent guest state and an explicitly initiated transition - which is
> naturally not the case while emulating a vmexit.

We have the same problem in KVM_SET_SREGS.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Kiszka Feb. 24, 2013, 9:01 a.m. UTC | #6
On 2013-02-24 09:56, Avi Kivity wrote:
> On Sat, Feb 23, 2013 at 11:57 PM, Jan Kiszka <jan.kiszka@web.de> wrote:
>> On 2013-02-23 22:45, Nadav Har'El wrote:
>>> On Sat, Feb 23, 2013, Jan Kiszka wrote about "[PATCH] KVM: nVMX: Replace kvm_set_cr0 with vmx_set_cr0 in load_vmcs12_host_state":
>>>> -    kvm_set_cr0(vcpu, vmcs12->host_cr0);
>>>> +    vmx_set_cr0(vcpu, vmcs12->host_cr0);
>>>
>>> I don't remember now why I did this (and I'm not looking at the code),
>>> but this you'll need to really test carefully, including
>>> shadow-on-shadow mode (ept=0 in L0), to verify you're not missing any
>>> important side-effect of kvm_set_cr0.
>>>
>>> Also, if I remember correctly, during nVMX's review, Avi Kivity asked
>>> in several places that when I called vmx_set_cr0, I should instead call
>>> kvm_set_cr0(), because it does some extra stuff and does some extra
>>> checks. Hmm, see, see this:
>>>       http://markmail.org/message/hhidqyhbo2mrgxxc
>>>
>>> where Avi asked for the reverse patch you're attempting now.
>>
>> At least, kvm_set_cr0 can't be used as it assumes an otherwise
>> consistent guest state and an explicitly initiated transition - which is
>> naturally not the case while emulating a vmexit.
> 
> We have the same problem in KVM_SET_SREGS.

I don't see the problem. kvm_arch_vcpu_ioctl_set_sregs open-codes the
state update, not applying any transition checks.

Jan
Avi Kivity Feb. 24, 2013, 9:21 a.m. UTC | #7
On Sun, Feb 24, 2013 at 11:01 AM, Jan Kiszka <jan.kiszka@web.de> wrote:
> On 2013-02-24 09:56, Avi Kivity wrote:
>> On Sat, Feb 23, 2013 at 11:57 PM, Jan Kiszka <jan.kiszka@web.de> wrote:
>>> On 2013-02-23 22:45, Nadav Har'El wrote:
>>>> On Sat, Feb 23, 2013, Jan Kiszka wrote about "[PATCH] KVM: nVMX: Replace kvm_set_cr0 with vmx_set_cr0 in load_vmcs12_host_state":
>>>>> -    kvm_set_cr0(vcpu, vmcs12->host_cr0);
>>>>> +    vmx_set_cr0(vcpu, vmcs12->host_cr0);
>>>>
>>>> I don't remember now why I did this (and I'm not looking at the code),
>>>> but this you'll need to really test carefully, including
>>>> shadow-on-shadow mode (ept=0 in L0), to verify you're not missing any
>>>> important side-effect of kvm_set_cr0.
>>>>
>>>> Also, if I remember correctly, during nVMX's review, Avi Kivity asked
>>>> in several places that when I called vmx_set_cr0, I should instead call
>>>> kvm_set_cr0(), because it does some extra stuff and does some extra
>>>> checks. Hmm, see, see this:
>>>>       http://markmail.org/message/hhidqyhbo2mrgxxc
>>>>
>>>> where Avi asked for the reverse patch you're attempting now.
>>>
>>> At least, kvm_set_cr0 can't be used as it assumes an otherwise
>>> consistent guest state and an explicitly initiated transition - which is
>>> naturally not the case while emulating a vmexit.
>>
>> We have the same problem in KVM_SET_SREGS.
>
> I don't see the problem. kvm_arch_vcpu_ioctl_set_sregs open-codes the
> state update, not applying any transition checks.

That's the problem.  We have this open coding in three different
places (KVM_SET_SREGS, nvmx, nsvm).

It's not as if vmx_set_cr0() is defined as "kvm_set_cr0() without the
transition checks".
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Kiszka Feb. 24, 2013, 9:40 a.m. UTC | #8
On 2013-02-24 10:21, Avi Kivity wrote:
> On Sun, Feb 24, 2013 at 11:01 AM, Jan Kiszka <jan.kiszka@web.de> wrote:
>> On 2013-02-24 09:56, Avi Kivity wrote:
>>> On Sat, Feb 23, 2013 at 11:57 PM, Jan Kiszka <jan.kiszka@web.de> wrote:
>>>> On 2013-02-23 22:45, Nadav Har'El wrote:
>>>>> On Sat, Feb 23, 2013, Jan Kiszka wrote about "[PATCH] KVM: nVMX: Replace kvm_set_cr0 with vmx_set_cr0 in load_vmcs12_host_state":
>>>>>> -    kvm_set_cr0(vcpu, vmcs12->host_cr0);
>>>>>> +    vmx_set_cr0(vcpu, vmcs12->host_cr0);
>>>>>
>>>>> I don't remember now why I did this (and I'm not looking at the code),
>>>>> but this you'll need to really test carefully, including
>>>>> shadow-on-shadow mode (ept=0 in L0), to verify you're not missing any
>>>>> important side-effect of kvm_set_cr0.
>>>>>
>>>>> Also, if I remember correctly, during nVMX's review, Avi Kivity asked
>>>>> in several places that when I called vmx_set_cr0, I should instead call
>>>>> kvm_set_cr0(), because it does some extra stuff and does some extra
>>>>> checks. Hmm, see, see this:
>>>>>       http://markmail.org/message/hhidqyhbo2mrgxxc
>>>>>
>>>>> where Avi asked for the reverse patch you're attempting now.
>>>>
>>>> At least, kvm_set_cr0 can't be used as it assumes an otherwise
>>>> consistent guest state and an explicitly initiated transition - which is
>>>> naturally not the case while emulating a vmexit.
>>>
>>> We have the same problem in KVM_SET_SREGS.
>>
>> I don't see the problem. kvm_arch_vcpu_ioctl_set_sregs open-codes the
>> state update, not applying any transition checks.
> 
> That's the problem.  We have this open coding in three different
> places (KVM_SET_SREGS, nvmx, nsvm).
> 
> It's not as if vmx_set_cr0() is defined as "kvm_set_cr0() without the
> transition checks".

...and without mmu updates. The latter is done via or after the closing
cr3 update. Interestingly, nsvm does not perform kvm_set_cr3 on vmexit
when in npt mode. Seems things aren't that regular.

Jan
Avi Kivity Feb. 24, 2013, 10:11 a.m. UTC | #9
On Sun, Feb 24, 2013 at 11:40 AM, Jan Kiszka <jan.kiszka@web.de> wrote:
>>>> We have the same problem in KVM_SET_SREGS.
>>>
>>> I don't see the problem. kvm_arch_vcpu_ioctl_set_sregs open-codes the
>>> state update, not applying any transition checks.
>>
>> That's the problem.  We have this open coding in three different
>> places (KVM_SET_SREGS, nvmx, nsvm).
>>
>> It's not as if vmx_set_cr0() is defined as "kvm_set_cr0() without the
>> transition checks".
>
> ...and without mmu updates. The latter is done via or after the closing
> cr3 update. Interestingly, nsvm does not perform kvm_set_cr3 on vmexit
> when in npt mode. Seems things aren't that regular.

We do want the mmu updates.  Of course they can't be attached to
kvm_set_cr0_without_the_checks() since there is cross-register
dependencies.

Option 1 is kvm_set_multiple_cr().  This does the checks and updates,
but only after all registers are updated.
Option 2 is kvm_begin_cr_transaction()/kvm_commit_cr_transaction().
Prettier and more flexible, but a more complicated to implement.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Kiszka Feb. 24, 2013, 10:49 a.m. UTC | #10
On 2013-02-24 11:11, Avi Kivity wrote:
> On Sun, Feb 24, 2013 at 11:40 AM, Jan Kiszka <jan.kiszka@web.de> wrote:
>>>>> We have the same problem in KVM_SET_SREGS.
>>>>
>>>> I don't see the problem. kvm_arch_vcpu_ioctl_set_sregs open-codes the
>>>> state update, not applying any transition checks.
>>>
>>> That's the problem.  We have this open coding in three different
>>> places (KVM_SET_SREGS, nvmx, nsvm).
>>>
>>> It's not as if vmx_set_cr0() is defined as "kvm_set_cr0() without the
>>> transition checks".
>>
>> ...and without mmu updates. The latter is done via or after the closing
>> cr3 update. Interestingly, nsvm does not perform kvm_set_cr3 on vmexit
>> when in npt mode. Seems things aren't that regular.
> 
> We do want the mmu updates.  Of course they can't be attached to
> kvm_set_cr0_without_the_checks() since there is cross-register
> dependencies.
> 
> Option 1 is kvm_set_multiple_cr().  This does the checks and updates,
> but only after all registers are updated.
> Option 2 is kvm_begin_cr_transaction()/kvm_commit_cr_transaction().
> Prettier and more flexible, but a more complicated to implement.
> 

The only thing that these three use case truly have in common is the
closing kvm_mmu_reset_context. Maybe nvmx and nsvm can share a bit more.
But let's get nVMX right first, then think about sharing.

Jan
Avi Kivity Feb. 24, 2013, 6:56 p.m. UTC | #11
On Sun, Feb 24, 2013 at 12:49 PM, Jan Kiszka <jan.kiszka@web.de> wrote:
> On 2013-02-24 11:11, Avi Kivity wrote:
>> On Sun, Feb 24, 2013 at 11:40 AM, Jan Kiszka <jan.kiszka@web.de> wrote:
>>>>>> We have the same problem in KVM_SET_SREGS.
>>>>>
>>>>> I don't see the problem. kvm_arch_vcpu_ioctl_set_sregs open-codes the
>>>>> state update, not applying any transition checks.
>>>>
>>>> That's the problem.  We have this open coding in three different
>>>> places (KVM_SET_SREGS, nvmx, nsvm).
>>>>
>>>> It's not as if vmx_set_cr0() is defined as "kvm_set_cr0() without the
>>>> transition checks".
>>>
>>> ...and without mmu updates. The latter is done via or after the closing
>>> cr3 update. Interestingly, nsvm does not perform kvm_set_cr3 on vmexit
>>> when in npt mode. Seems things aren't that regular.
>>
>> We do want the mmu updates.  Of course they can't be attached to
>> kvm_set_cr0_without_the_checks() since there is cross-register
>> dependencies.
>>
>> Option 1 is kvm_set_multiple_cr().  This does the checks and updates,
>> but only after all registers are updated.
>> Option 2 is kvm_begin_cr_transaction()/kvm_commit_cr_transaction().
>> Prettier and more flexible, but a more complicated to implement.
>>
>
> The only thing that these three use case truly have in common is the
> closing kvm_mmu_reset_context. Maybe nvmx and nsvm can share a bit more.
> But let's get nVMX right first, then think about sharing.

They all need consistency checks, otherwise userspace or the guest and
inject inconsistent values and perhaps exploit the host.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Kiszka Feb. 24, 2013, 7:15 p.m. UTC | #12
On 2013-02-24 19:56, Avi Kivity wrote:
> On Sun, Feb 24, 2013 at 12:49 PM, Jan Kiszka <jan.kiszka@web.de> wrote:
>> On 2013-02-24 11:11, Avi Kivity wrote:
>>> On Sun, Feb 24, 2013 at 11:40 AM, Jan Kiszka <jan.kiszka@web.de> wrote:
>>>>>>> We have the same problem in KVM_SET_SREGS.
>>>>>>
>>>>>> I don't see the problem. kvm_arch_vcpu_ioctl_set_sregs open-codes the
>>>>>> state update, not applying any transition checks.
>>>>>
>>>>> That's the problem.  We have this open coding in three different
>>>>> places (KVM_SET_SREGS, nvmx, nsvm).
>>>>>
>>>>> It's not as if vmx_set_cr0() is defined as "kvm_set_cr0() without the
>>>>> transition checks".
>>>>
>>>> ...and without mmu updates. The latter is done via or after the closing
>>>> cr3 update. Interestingly, nsvm does not perform kvm_set_cr3 on vmexit
>>>> when in npt mode. Seems things aren't that regular.
>>>
>>> We do want the mmu updates.  Of course they can't be attached to
>>> kvm_set_cr0_without_the_checks() since there is cross-register
>>> dependencies.
>>>
>>> Option 1 is kvm_set_multiple_cr().  This does the checks and updates,
>>> but only after all registers are updated.
>>> Option 2 is kvm_begin_cr_transaction()/kvm_commit_cr_transaction().
>>> Prettier and more flexible, but a more complicated to implement.
>>>
>>
>> The only thing that these three use case truly have in common is the
>> closing kvm_mmu_reset_context. Maybe nvmx and nsvm can share a bit more.
>> But let's get nVMX right first, then think about sharing.
> 
> They all need consistency checks, otherwise userspace or the guest and
> inject inconsistent values and perhaps exploit the host.

To my understanding, the hardware does this for us: If we try to enter
the guest (L1, L2) with invalid CRx bits set or cleared, we get an
error, at least on Intel. But I bet AMD does so as well - and, if not,
it would make this test specific again.

Jan
Avi Kivity Feb. 24, 2013, 7:26 p.m. UTC | #13
On Sun, Feb 24, 2013 at 9:15 PM, Jan Kiszka <jan.kiszka@web.de> wrote:
>>
>> They all need consistency checks, otherwise userspace or the guest and
>> inject inconsistent values and perhaps exploit the host.
>
> To my understanding, the hardware does this for us: If we try to enter
> the guest (L1, L2) with invalid CRx bits set or cleared, we get an
> error, at least on Intel. But I bet AMD does so as well - and, if not,
> it would make this test specific again.

The point is that kvm code may depend on this consistency, so we fail
before that hardware has made the check.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Kiszka April 28, 2013, 10:20 a.m. UTC | #14
On 2013-02-23 22:35, Jan Kiszka wrote:
> From: Jan Kiszka <jan.kiszka@siemens.com>
> 
> Likely a typo, but a fatal one as kvm_set_cr0 performs checks on the
> state transition that may prevent loading L1's cr0.
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> ---
>  arch/x86/kvm/vmx.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 26d47e9..94f3b66 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -7429,7 +7429,7 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
>  	 * fpu_active (which may have changed).
>  	 * Note that vmx_set_cr0 refers to efer set above.
>  	 */
> -	kvm_set_cr0(vcpu, vmcs12->host_cr0);
> +	vmx_set_cr0(vcpu, vmcs12->host_cr0);
>  	/*
>  	 * If we did fpu_activate()/fpu_deactivate() during L2's run, we need
>  	 * to apply the same changes to L1's vmcs. We just set cr0 correctly,
> 

This one still applies, is necessary for nested unrestricted guest mode,
and I'm still convinced it's an appropriate way to fix the bug. How to
proceed?

Jan
Gleb Natapov April 30, 2013, 11:46 a.m. UTC | #15
On Sun, Apr 28, 2013 at 12:20:38PM +0200, Jan Kiszka wrote:
> On 2013-02-23 22:35, Jan Kiszka wrote:
> > From: Jan Kiszka <jan.kiszka@siemens.com>
> > 
> > Likely a typo, but a fatal one as kvm_set_cr0 performs checks on the
> > state transition that may prevent loading L1's cr0.
> > 
> > Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> > ---
> >  arch/x86/kvm/vmx.c |    2 +-
> >  1 files changed, 1 insertions(+), 1 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> > index 26d47e9..94f3b66 100644
> > --- a/arch/x86/kvm/vmx.c
> > +++ b/arch/x86/kvm/vmx.c
> > @@ -7429,7 +7429,7 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
> >  	 * fpu_active (which may have changed).
> >  	 * Note that vmx_set_cr0 refers to efer set above.
> >  	 */
> > -	kvm_set_cr0(vcpu, vmcs12->host_cr0);
> > +	vmx_set_cr0(vcpu, vmcs12->host_cr0);
> >  	/*
> >  	 * If we did fpu_activate()/fpu_deactivate() during L2's run, we need
> >  	 * to apply the same changes to L1's vmcs. We just set cr0 correctly,
> > 
> 
> This one still applies, is necessary for nested unrestricted guest mode,
> and I'm still convinced it's an appropriate way to fix the bug. How to
> proceed?
> 
What check that is done by kvm_set_cr0() fails?

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Kiszka April 30, 2013, 12:42 p.m. UTC | #16
On 2013-04-30 13:46, Gleb Natapov wrote:
> On Sun, Apr 28, 2013 at 12:20:38PM +0200, Jan Kiszka wrote:
>> On 2013-02-23 22:35, Jan Kiszka wrote:
>>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>>
>>> Likely a typo, but a fatal one as kvm_set_cr0 performs checks on the
>>> state transition that may prevent loading L1's cr0.
>>>
>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>> ---
>>>  arch/x86/kvm/vmx.c |    2 +-
>>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>> index 26d47e9..94f3b66 100644
>>> --- a/arch/x86/kvm/vmx.c
>>> +++ b/arch/x86/kvm/vmx.c
>>> @@ -7429,7 +7429,7 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
>>>  	 * fpu_active (which may have changed).
>>>  	 * Note that vmx_set_cr0 refers to efer set above.
>>>  	 */
>>> -	kvm_set_cr0(vcpu, vmcs12->host_cr0);
>>> +	vmx_set_cr0(vcpu, vmcs12->host_cr0);
>>>  	/*
>>>  	 * If we did fpu_activate()/fpu_deactivate() during L2's run, we need
>>>  	 * to apply the same changes to L1's vmcs. We just set cr0 correctly,
>>>
>>
>> This one still applies, is necessary for nested unrestricted guest mode,
>> and I'm still convinced it's an appropriate way to fix the bug. How to
>> proceed?
>>
> What check that is done by kvm_set_cr0() fails?

Would have to reproduce the bug to confirm, but from the top of my head
and from looking at the code again:

if (!is_paging(vcpu) && (cr0 & X86_CR0_PG)) {
	if ((vcpu->arch.efer & EFER_LME)) {
		int cs_db, cs_l;

		if (!is_pae(vcpu))
			return 1;
		kvm_x86_ops->get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
		if (cs_l)
			return 1;

I think to remember this last check triggered. When we come from the
guest with paging off, we may run through this check an incorrectly bail
out here when the host state fulfills the conditions (PG, EFER_LME, and
L bit set).

Jan
Jan Kiszka May 5, 2013, 9:02 a.m. UTC | #17
On 2013-04-30 14:42, Jan Kiszka wrote:
> On 2013-04-30 13:46, Gleb Natapov wrote:
>> On Sun, Apr 28, 2013 at 12:20:38PM +0200, Jan Kiszka wrote:
>>> On 2013-02-23 22:35, Jan Kiszka wrote:
>>>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>>>
>>>> Likely a typo, but a fatal one as kvm_set_cr0 performs checks on the
>>>> state transition that may prevent loading L1's cr0.
>>>>
>>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>>> ---
>>>>  arch/x86/kvm/vmx.c |    2 +-
>>>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>>>
>>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>>> index 26d47e9..94f3b66 100644
>>>> --- a/arch/x86/kvm/vmx.c
>>>> +++ b/arch/x86/kvm/vmx.c
>>>> @@ -7429,7 +7429,7 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
>>>>  	 * fpu_active (which may have changed).
>>>>  	 * Note that vmx_set_cr0 refers to efer set above.
>>>>  	 */
>>>> -	kvm_set_cr0(vcpu, vmcs12->host_cr0);
>>>> +	vmx_set_cr0(vcpu, vmcs12->host_cr0);
>>>>  	/*
>>>>  	 * If we did fpu_activate()/fpu_deactivate() during L2's run, we need
>>>>  	 * to apply the same changes to L1's vmcs. We just set cr0 correctly,
>>>>
>>>
>>> This one still applies, is necessary for nested unrestricted guest mode,
>>> and I'm still convinced it's an appropriate way to fix the bug. How to
>>> proceed?
>>>
>> What check that is done by kvm_set_cr0() fails?
> 
> Would have to reproduce the bug to confirm, but from the top of my head
> and from looking at the code again:
> 
> if (!is_paging(vcpu) && (cr0 & X86_CR0_PG)) {
> 	if ((vcpu->arch.efer & EFER_LME)) {
> 		int cs_db, cs_l;
> 
> 		if (!is_pae(vcpu))
> 			return 1;
> 		kvm_x86_ops->get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
> 		if (cs_l)
> 			return 1;
> 
> I think to remember this last check triggered. When we come from the
> guest with paging off, we may run through this check an incorrectly bail
> out here when the host state fulfills the conditions (PG, EFER_LME, and
> L bit set).

Just retried, and actually the first check (!is_pae) fails right now
(with nested unrestricted guest mode patched in). The second one
stumbles if I set CR4 before CR1 in load_vmcs12_host_state.

So, however you put it, calling kvm_set_cr0 remains wrong.

Jan
diff mbox

Patch

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 26d47e9..94f3b66 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -7429,7 +7429,7 @@  static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
 	 * fpu_active (which may have changed).
 	 * Note that vmx_set_cr0 refers to efer set above.
 	 */
-	kvm_set_cr0(vcpu, vmcs12->host_cr0);
+	vmx_set_cr0(vcpu, vmcs12->host_cr0);
 	/*
 	 * If we did fpu_activate()/fpu_deactivate() during L2's run, we need
 	 * to apply the same changes to L1's vmcs. We just set cr0 correctly,