Message ID | 5129361A.7090608@web.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Sat, Feb 23, 2013, Jan Kiszka wrote about "[PATCH] KVM: nVMX: Replace kvm_set_cr0 with vmx_set_cr0 in load_vmcs12_host_state": > - kvm_set_cr0(vcpu, vmcs12->host_cr0); > + vmx_set_cr0(vcpu, vmcs12->host_cr0); I don't remember now why I did this (and I'm not looking at the code), but this you'll need to really test carefully, including shadow-on-shadow mode (ept=0 in L0), to verify you're not missing any important side-effect of kvm_set_cr0. Also, if I remember correctly, during nVMX's review, Avi Kivity asked in several places that when I called vmx_set_cr0, I should instead call kvm_set_cr0(), because it does some extra stuff and does some extra checks. Hmm, see, see this: http://markmail.org/message/hhidqyhbo2mrgxxc where Avi asked for the reverse patch you're attempting now.
On 2013-02-23 22:45, Nadav Har'El wrote: > On Sat, Feb 23, 2013, Jan Kiszka wrote about "[PATCH] KVM: nVMX: Replace kvm_set_cr0 with vmx_set_cr0 in load_vmcs12_host_state": >> - kvm_set_cr0(vcpu, vmcs12->host_cr0); >> + vmx_set_cr0(vcpu, vmcs12->host_cr0); > > I don't remember now why I did this (and I'm not looking at the code), > but this you'll need to really test carefully, including > shadow-on-shadow mode (ept=0 in L0), to verify you're not missing any > important side-effect of kvm_set_cr0. > > Also, if I remember correctly, during nVMX's review, Avi Kivity asked > in several places that when I called vmx_set_cr0, I should instead call > kvm_set_cr0(), because it does some extra stuff and does some extra > checks. Hmm, see, see this: > http://markmail.org/message/hhidqyhbo2mrgxxc > > where Avi asked for the reverse patch you're attempting now. At least, kvm_set_cr0 can't be used as it assumes an otherwise consistent guest state and an explicitly initiated transition - which is naturally not the case while emulating a vmexit. However, there are some side effects that need a closer look, likely also in other places where vmx_set_cr0 is used directly. E.g. load_pdptrs. The lack of calling them on host->guest switch may explain the PAE issues you once saw with your nEPT implementation. OK, more work... Jan
On 2013-02-23 22:57, Jan Kiszka wrote: > On 2013-02-23 22:45, Nadav Har'El wrote: >> On Sat, Feb 23, 2013, Jan Kiszka wrote about "[PATCH] KVM: nVMX: Replace kvm_set_cr0 with vmx_set_cr0 in load_vmcs12_host_state": >>> - kvm_set_cr0(vcpu, vmcs12->host_cr0); >>> + vmx_set_cr0(vcpu, vmcs12->host_cr0); >> >> I don't remember now why I did this (and I'm not looking at the code), >> but this you'll need to really test carefully, including >> shadow-on-shadow mode (ept=0 in L0), to verify you're not missing any >> important side-effect of kvm_set_cr0. >> >> Also, if I remember correctly, during nVMX's review, Avi Kivity asked >> in several places that when I called vmx_set_cr0, I should instead call >> kvm_set_cr0(), because it does some extra stuff and does some extra >> checks. Hmm, see, see this: >> http://markmail.org/message/hhidqyhbo2mrgxxc >> >> where Avi asked for the reverse patch you're attempting now. > > At least, kvm_set_cr0 can't be used as it assumes an otherwise > consistent guest state and an explicitly initiated transition - which is > naturally not the case while emulating a vmexit. > > However, there are some side effects that need a closer look, likely > also in other places where vmx_set_cr0 is used directly. E.g. > load_pdptrs. The lack of calling them on host->guest switch may explain > the PAE issues you once saw with your nEPT implementation. > > OK, more work... Checked again, and the patch should be fine: load_pdptrs is executed by kvm_set_cr3 later in this functions (same for prepare_vmcs02, so no problem for PAE here), kvm_mmu_reset_context is even run explicitly and clearing the async-pf queue makes no sense when leaving the guest (the guest can't use it unless L1 is broken and passes this through). However, we should clear the queue when leaving the host - separate patch. Jan
On 2013-02-23 23:21, Jan Kiszka wrote: > On 2013-02-23 22:57, Jan Kiszka wrote: >> On 2013-02-23 22:45, Nadav Har'El wrote: >>> On Sat, Feb 23, 2013, Jan Kiszka wrote about "[PATCH] KVM: nVMX: Replace kvm_set_cr0 with vmx_set_cr0 in load_vmcs12_host_state": >>>> - kvm_set_cr0(vcpu, vmcs12->host_cr0); >>>> + vmx_set_cr0(vcpu, vmcs12->host_cr0); >>> >>> I don't remember now why I did this (and I'm not looking at the code), >>> but this you'll need to really test carefully, including >>> shadow-on-shadow mode (ept=0 in L0), to verify you're not missing any >>> important side-effect of kvm_set_cr0. >>> >>> Also, if I remember correctly, during nVMX's review, Avi Kivity asked >>> in several places that when I called vmx_set_cr0, I should instead call >>> kvm_set_cr0(), because it does some extra stuff and does some extra >>> checks. Hmm, see, see this: >>> http://markmail.org/message/hhidqyhbo2mrgxxc >>> >>> where Avi asked for the reverse patch you're attempting now. >> >> At least, kvm_set_cr0 can't be used as it assumes an otherwise >> consistent guest state and an explicitly initiated transition - which is >> naturally not the case while emulating a vmexit. >> >> However, there are some side effects that need a closer look, likely >> also in other places where vmx_set_cr0 is used directly. E.g. >> load_pdptrs. The lack of calling them on host->guest switch may explain >> the PAE issues you once saw with your nEPT implementation. >> >> OK, more work... > > Checked again, and the patch should be fine: load_pdptrs is executed by > kvm_set_cr3 later in this functions (same for prepare_vmcs02, so no > problem for PAE here), kvm_mmu_reset_context is even run explicitly and > clearing the async-pf queue makes no sense when leaving the guest (the > guest can't use it unless L1 is broken and passes this through). > However, we should clear the queue when leaving the host - separate patch. Thought about the async-pf thing again: It rather looks to me like the host does not have to worry about it at all. It's L1's job to either disable async-pf before entering L2 or trap PFs and handle paravirtual reasons before they hit L2. I don't see that KVM already does this, but I don't think that matters at this point. Jan
On Sat, Feb 23, 2013 at 11:57 PM, Jan Kiszka <jan.kiszka@web.de> wrote: > On 2013-02-23 22:45, Nadav Har'El wrote: >> On Sat, Feb 23, 2013, Jan Kiszka wrote about "[PATCH] KVM: nVMX: Replace kvm_set_cr0 with vmx_set_cr0 in load_vmcs12_host_state": >>> - kvm_set_cr0(vcpu, vmcs12->host_cr0); >>> + vmx_set_cr0(vcpu, vmcs12->host_cr0); >> >> I don't remember now why I did this (and I'm not looking at the code), >> but this you'll need to really test carefully, including >> shadow-on-shadow mode (ept=0 in L0), to verify you're not missing any >> important side-effect of kvm_set_cr0. >> >> Also, if I remember correctly, during nVMX's review, Avi Kivity asked >> in several places that when I called vmx_set_cr0, I should instead call >> kvm_set_cr0(), because it does some extra stuff and does some extra >> checks. Hmm, see, see this: >> http://markmail.org/message/hhidqyhbo2mrgxxc >> >> where Avi asked for the reverse patch you're attempting now. > > At least, kvm_set_cr0 can't be used as it assumes an otherwise > consistent guest state and an explicitly initiated transition - which is > naturally not the case while emulating a vmexit. We have the same problem in KVM_SET_SREGS. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2013-02-24 09:56, Avi Kivity wrote: > On Sat, Feb 23, 2013 at 11:57 PM, Jan Kiszka <jan.kiszka@web.de> wrote: >> On 2013-02-23 22:45, Nadav Har'El wrote: >>> On Sat, Feb 23, 2013, Jan Kiszka wrote about "[PATCH] KVM: nVMX: Replace kvm_set_cr0 with vmx_set_cr0 in load_vmcs12_host_state": >>>> - kvm_set_cr0(vcpu, vmcs12->host_cr0); >>>> + vmx_set_cr0(vcpu, vmcs12->host_cr0); >>> >>> I don't remember now why I did this (and I'm not looking at the code), >>> but this you'll need to really test carefully, including >>> shadow-on-shadow mode (ept=0 in L0), to verify you're not missing any >>> important side-effect of kvm_set_cr0. >>> >>> Also, if I remember correctly, during nVMX's review, Avi Kivity asked >>> in several places that when I called vmx_set_cr0, I should instead call >>> kvm_set_cr0(), because it does some extra stuff and does some extra >>> checks. Hmm, see, see this: >>> http://markmail.org/message/hhidqyhbo2mrgxxc >>> >>> where Avi asked for the reverse patch you're attempting now. >> >> At least, kvm_set_cr0 can't be used as it assumes an otherwise >> consistent guest state and an explicitly initiated transition - which is >> naturally not the case while emulating a vmexit. > > We have the same problem in KVM_SET_SREGS. I don't see the problem. kvm_arch_vcpu_ioctl_set_sregs open-codes the state update, not applying any transition checks. Jan
On Sun, Feb 24, 2013 at 11:01 AM, Jan Kiszka <jan.kiszka@web.de> wrote: > On 2013-02-24 09:56, Avi Kivity wrote: >> On Sat, Feb 23, 2013 at 11:57 PM, Jan Kiszka <jan.kiszka@web.de> wrote: >>> On 2013-02-23 22:45, Nadav Har'El wrote: >>>> On Sat, Feb 23, 2013, Jan Kiszka wrote about "[PATCH] KVM: nVMX: Replace kvm_set_cr0 with vmx_set_cr0 in load_vmcs12_host_state": >>>>> - kvm_set_cr0(vcpu, vmcs12->host_cr0); >>>>> + vmx_set_cr0(vcpu, vmcs12->host_cr0); >>>> >>>> I don't remember now why I did this (and I'm not looking at the code), >>>> but this you'll need to really test carefully, including >>>> shadow-on-shadow mode (ept=0 in L0), to verify you're not missing any >>>> important side-effect of kvm_set_cr0. >>>> >>>> Also, if I remember correctly, during nVMX's review, Avi Kivity asked >>>> in several places that when I called vmx_set_cr0, I should instead call >>>> kvm_set_cr0(), because it does some extra stuff and does some extra >>>> checks. Hmm, see, see this: >>>> http://markmail.org/message/hhidqyhbo2mrgxxc >>>> >>>> where Avi asked for the reverse patch you're attempting now. >>> >>> At least, kvm_set_cr0 can't be used as it assumes an otherwise >>> consistent guest state and an explicitly initiated transition - which is >>> naturally not the case while emulating a vmexit. >> >> We have the same problem in KVM_SET_SREGS. > > I don't see the problem. kvm_arch_vcpu_ioctl_set_sregs open-codes the > state update, not applying any transition checks. That's the problem. We have this open coding in three different places (KVM_SET_SREGS, nvmx, nsvm). It's not as if vmx_set_cr0() is defined as "kvm_set_cr0() without the transition checks". -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2013-02-24 10:21, Avi Kivity wrote: > On Sun, Feb 24, 2013 at 11:01 AM, Jan Kiszka <jan.kiszka@web.de> wrote: >> On 2013-02-24 09:56, Avi Kivity wrote: >>> On Sat, Feb 23, 2013 at 11:57 PM, Jan Kiszka <jan.kiszka@web.de> wrote: >>>> On 2013-02-23 22:45, Nadav Har'El wrote: >>>>> On Sat, Feb 23, 2013, Jan Kiszka wrote about "[PATCH] KVM: nVMX: Replace kvm_set_cr0 with vmx_set_cr0 in load_vmcs12_host_state": >>>>>> - kvm_set_cr0(vcpu, vmcs12->host_cr0); >>>>>> + vmx_set_cr0(vcpu, vmcs12->host_cr0); >>>>> >>>>> I don't remember now why I did this (and I'm not looking at the code), >>>>> but this you'll need to really test carefully, including >>>>> shadow-on-shadow mode (ept=0 in L0), to verify you're not missing any >>>>> important side-effect of kvm_set_cr0. >>>>> >>>>> Also, if I remember correctly, during nVMX's review, Avi Kivity asked >>>>> in several places that when I called vmx_set_cr0, I should instead call >>>>> kvm_set_cr0(), because it does some extra stuff and does some extra >>>>> checks. Hmm, see, see this: >>>>> http://markmail.org/message/hhidqyhbo2mrgxxc >>>>> >>>>> where Avi asked for the reverse patch you're attempting now. >>>> >>>> At least, kvm_set_cr0 can't be used as it assumes an otherwise >>>> consistent guest state and an explicitly initiated transition - which is >>>> naturally not the case while emulating a vmexit. >>> >>> We have the same problem in KVM_SET_SREGS. >> >> I don't see the problem. kvm_arch_vcpu_ioctl_set_sregs open-codes the >> state update, not applying any transition checks. > > That's the problem. We have this open coding in three different > places (KVM_SET_SREGS, nvmx, nsvm). > > It's not as if vmx_set_cr0() is defined as "kvm_set_cr0() without the > transition checks". ...and without mmu updates. The latter is done via or after the closing cr3 update. Interestingly, nsvm does not perform kvm_set_cr3 on vmexit when in npt mode. Seems things aren't that regular. Jan
On Sun, Feb 24, 2013 at 11:40 AM, Jan Kiszka <jan.kiszka@web.de> wrote: >>>> We have the same problem in KVM_SET_SREGS. >>> >>> I don't see the problem. kvm_arch_vcpu_ioctl_set_sregs open-codes the >>> state update, not applying any transition checks. >> >> That's the problem. We have this open coding in three different >> places (KVM_SET_SREGS, nvmx, nsvm). >> >> It's not as if vmx_set_cr0() is defined as "kvm_set_cr0() without the >> transition checks". > > ...and without mmu updates. The latter is done via or after the closing > cr3 update. Interestingly, nsvm does not perform kvm_set_cr3 on vmexit > when in npt mode. Seems things aren't that regular. We do want the mmu updates. Of course they can't be attached to kvm_set_cr0_without_the_checks() since there is cross-register dependencies. Option 1 is kvm_set_multiple_cr(). This does the checks and updates, but only after all registers are updated. Option 2 is kvm_begin_cr_transaction()/kvm_commit_cr_transaction(). Prettier and more flexible, but a more complicated to implement. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2013-02-24 11:11, Avi Kivity wrote: > On Sun, Feb 24, 2013 at 11:40 AM, Jan Kiszka <jan.kiszka@web.de> wrote: >>>>> We have the same problem in KVM_SET_SREGS. >>>> >>>> I don't see the problem. kvm_arch_vcpu_ioctl_set_sregs open-codes the >>>> state update, not applying any transition checks. >>> >>> That's the problem. We have this open coding in three different >>> places (KVM_SET_SREGS, nvmx, nsvm). >>> >>> It's not as if vmx_set_cr0() is defined as "kvm_set_cr0() without the >>> transition checks". >> >> ...and without mmu updates. The latter is done via or after the closing >> cr3 update. Interestingly, nsvm does not perform kvm_set_cr3 on vmexit >> when in npt mode. Seems things aren't that regular. > > We do want the mmu updates. Of course they can't be attached to > kvm_set_cr0_without_the_checks() since there is cross-register > dependencies. > > Option 1 is kvm_set_multiple_cr(). This does the checks and updates, > but only after all registers are updated. > Option 2 is kvm_begin_cr_transaction()/kvm_commit_cr_transaction(). > Prettier and more flexible, but a more complicated to implement. > The only thing that these three use case truly have in common is the closing kvm_mmu_reset_context. Maybe nvmx and nsvm can share a bit more. But let's get nVMX right first, then think about sharing. Jan
On Sun, Feb 24, 2013 at 12:49 PM, Jan Kiszka <jan.kiszka@web.de> wrote: > On 2013-02-24 11:11, Avi Kivity wrote: >> On Sun, Feb 24, 2013 at 11:40 AM, Jan Kiszka <jan.kiszka@web.de> wrote: >>>>>> We have the same problem in KVM_SET_SREGS. >>>>> >>>>> I don't see the problem. kvm_arch_vcpu_ioctl_set_sregs open-codes the >>>>> state update, not applying any transition checks. >>>> >>>> That's the problem. We have this open coding in three different >>>> places (KVM_SET_SREGS, nvmx, nsvm). >>>> >>>> It's not as if vmx_set_cr0() is defined as "kvm_set_cr0() without the >>>> transition checks". >>> >>> ...and without mmu updates. The latter is done via or after the closing >>> cr3 update. Interestingly, nsvm does not perform kvm_set_cr3 on vmexit >>> when in npt mode. Seems things aren't that regular. >> >> We do want the mmu updates. Of course they can't be attached to >> kvm_set_cr0_without_the_checks() since there is cross-register >> dependencies. >> >> Option 1 is kvm_set_multiple_cr(). This does the checks and updates, >> but only after all registers are updated. >> Option 2 is kvm_begin_cr_transaction()/kvm_commit_cr_transaction(). >> Prettier and more flexible, but a more complicated to implement. >> > > The only thing that these three use case truly have in common is the > closing kvm_mmu_reset_context. Maybe nvmx and nsvm can share a bit more. > But let's get nVMX right first, then think about sharing. They all need consistency checks, otherwise userspace or the guest and inject inconsistent values and perhaps exploit the host. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2013-02-24 19:56, Avi Kivity wrote: > On Sun, Feb 24, 2013 at 12:49 PM, Jan Kiszka <jan.kiszka@web.de> wrote: >> On 2013-02-24 11:11, Avi Kivity wrote: >>> On Sun, Feb 24, 2013 at 11:40 AM, Jan Kiszka <jan.kiszka@web.de> wrote: >>>>>>> We have the same problem in KVM_SET_SREGS. >>>>>> >>>>>> I don't see the problem. kvm_arch_vcpu_ioctl_set_sregs open-codes the >>>>>> state update, not applying any transition checks. >>>>> >>>>> That's the problem. We have this open coding in three different >>>>> places (KVM_SET_SREGS, nvmx, nsvm). >>>>> >>>>> It's not as if vmx_set_cr0() is defined as "kvm_set_cr0() without the >>>>> transition checks". >>>> >>>> ...and without mmu updates. The latter is done via or after the closing >>>> cr3 update. Interestingly, nsvm does not perform kvm_set_cr3 on vmexit >>>> when in npt mode. Seems things aren't that regular. >>> >>> We do want the mmu updates. Of course they can't be attached to >>> kvm_set_cr0_without_the_checks() since there is cross-register >>> dependencies. >>> >>> Option 1 is kvm_set_multiple_cr(). This does the checks and updates, >>> but only after all registers are updated. >>> Option 2 is kvm_begin_cr_transaction()/kvm_commit_cr_transaction(). >>> Prettier and more flexible, but a more complicated to implement. >>> >> >> The only thing that these three use case truly have in common is the >> closing kvm_mmu_reset_context. Maybe nvmx and nsvm can share a bit more. >> But let's get nVMX right first, then think about sharing. > > They all need consistency checks, otherwise userspace or the guest and > inject inconsistent values and perhaps exploit the host. To my understanding, the hardware does this for us: If we try to enter the guest (L1, L2) with invalid CRx bits set or cleared, we get an error, at least on Intel. But I bet AMD does so as well - and, if not, it would make this test specific again. Jan
On Sun, Feb 24, 2013 at 9:15 PM, Jan Kiszka <jan.kiszka@web.de> wrote: >> >> They all need consistency checks, otherwise userspace or the guest and >> inject inconsistent values and perhaps exploit the host. > > To my understanding, the hardware does this for us: If we try to enter > the guest (L1, L2) with invalid CRx bits set or cleared, we get an > error, at least on Intel. But I bet AMD does so as well - and, if not, > it would make this test specific again. The point is that kvm code may depend on this consistency, so we fail before that hardware has made the check. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2013-02-23 22:35, Jan Kiszka wrote: > From: Jan Kiszka <jan.kiszka@siemens.com> > > Likely a typo, but a fatal one as kvm_set_cr0 performs checks on the > state transition that may prevent loading L1's cr0. > > Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> > --- > arch/x86/kvm/vmx.c | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c > index 26d47e9..94f3b66 100644 > --- a/arch/x86/kvm/vmx.c > +++ b/arch/x86/kvm/vmx.c > @@ -7429,7 +7429,7 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu, > * fpu_active (which may have changed). > * Note that vmx_set_cr0 refers to efer set above. > */ > - kvm_set_cr0(vcpu, vmcs12->host_cr0); > + vmx_set_cr0(vcpu, vmcs12->host_cr0); > /* > * If we did fpu_activate()/fpu_deactivate() during L2's run, we need > * to apply the same changes to L1's vmcs. We just set cr0 correctly, > This one still applies, is necessary for nested unrestricted guest mode, and I'm still convinced it's an appropriate way to fix the bug. How to proceed? Jan
On Sun, Apr 28, 2013 at 12:20:38PM +0200, Jan Kiszka wrote: > On 2013-02-23 22:35, Jan Kiszka wrote: > > From: Jan Kiszka <jan.kiszka@siemens.com> > > > > Likely a typo, but a fatal one as kvm_set_cr0 performs checks on the > > state transition that may prevent loading L1's cr0. > > > > Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> > > --- > > arch/x86/kvm/vmx.c | 2 +- > > 1 files changed, 1 insertions(+), 1 deletions(-) > > > > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c > > index 26d47e9..94f3b66 100644 > > --- a/arch/x86/kvm/vmx.c > > +++ b/arch/x86/kvm/vmx.c > > @@ -7429,7 +7429,7 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu, > > * fpu_active (which may have changed). > > * Note that vmx_set_cr0 refers to efer set above. > > */ > > - kvm_set_cr0(vcpu, vmcs12->host_cr0); > > + vmx_set_cr0(vcpu, vmcs12->host_cr0); > > /* > > * If we did fpu_activate()/fpu_deactivate() during L2's run, we need > > * to apply the same changes to L1's vmcs. We just set cr0 correctly, > > > > This one still applies, is necessary for nested unrestricted guest mode, > and I'm still convinced it's an appropriate way to fix the bug. How to > proceed? > What check that is done by kvm_set_cr0() fails? -- Gleb. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2013-04-30 13:46, Gleb Natapov wrote: > On Sun, Apr 28, 2013 at 12:20:38PM +0200, Jan Kiszka wrote: >> On 2013-02-23 22:35, Jan Kiszka wrote: >>> From: Jan Kiszka <jan.kiszka@siemens.com> >>> >>> Likely a typo, but a fatal one as kvm_set_cr0 performs checks on the >>> state transition that may prevent loading L1's cr0. >>> >>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> >>> --- >>> arch/x86/kvm/vmx.c | 2 +- >>> 1 files changed, 1 insertions(+), 1 deletions(-) >>> >>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c >>> index 26d47e9..94f3b66 100644 >>> --- a/arch/x86/kvm/vmx.c >>> +++ b/arch/x86/kvm/vmx.c >>> @@ -7429,7 +7429,7 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu, >>> * fpu_active (which may have changed). >>> * Note that vmx_set_cr0 refers to efer set above. >>> */ >>> - kvm_set_cr0(vcpu, vmcs12->host_cr0); >>> + vmx_set_cr0(vcpu, vmcs12->host_cr0); >>> /* >>> * If we did fpu_activate()/fpu_deactivate() during L2's run, we need >>> * to apply the same changes to L1's vmcs. We just set cr0 correctly, >>> >> >> This one still applies, is necessary for nested unrestricted guest mode, >> and I'm still convinced it's an appropriate way to fix the bug. How to >> proceed? >> > What check that is done by kvm_set_cr0() fails? Would have to reproduce the bug to confirm, but from the top of my head and from looking at the code again: if (!is_paging(vcpu) && (cr0 & X86_CR0_PG)) { if ((vcpu->arch.efer & EFER_LME)) { int cs_db, cs_l; if (!is_pae(vcpu)) return 1; kvm_x86_ops->get_cs_db_l_bits(vcpu, &cs_db, &cs_l); if (cs_l) return 1; I think to remember this last check triggered. When we come from the guest with paging off, we may run through this check an incorrectly bail out here when the host state fulfills the conditions (PG, EFER_LME, and L bit set). Jan
On 2013-04-30 14:42, Jan Kiszka wrote: > On 2013-04-30 13:46, Gleb Natapov wrote: >> On Sun, Apr 28, 2013 at 12:20:38PM +0200, Jan Kiszka wrote: >>> On 2013-02-23 22:35, Jan Kiszka wrote: >>>> From: Jan Kiszka <jan.kiszka@siemens.com> >>>> >>>> Likely a typo, but a fatal one as kvm_set_cr0 performs checks on the >>>> state transition that may prevent loading L1's cr0. >>>> >>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> >>>> --- >>>> arch/x86/kvm/vmx.c | 2 +- >>>> 1 files changed, 1 insertions(+), 1 deletions(-) >>>> >>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c >>>> index 26d47e9..94f3b66 100644 >>>> --- a/arch/x86/kvm/vmx.c >>>> +++ b/arch/x86/kvm/vmx.c >>>> @@ -7429,7 +7429,7 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu, >>>> * fpu_active (which may have changed). >>>> * Note that vmx_set_cr0 refers to efer set above. >>>> */ >>>> - kvm_set_cr0(vcpu, vmcs12->host_cr0); >>>> + vmx_set_cr0(vcpu, vmcs12->host_cr0); >>>> /* >>>> * If we did fpu_activate()/fpu_deactivate() during L2's run, we need >>>> * to apply the same changes to L1's vmcs. We just set cr0 correctly, >>>> >>> >>> This one still applies, is necessary for nested unrestricted guest mode, >>> and I'm still convinced it's an appropriate way to fix the bug. How to >>> proceed? >>> >> What check that is done by kvm_set_cr0() fails? > > Would have to reproduce the bug to confirm, but from the top of my head > and from looking at the code again: > > if (!is_paging(vcpu) && (cr0 & X86_CR0_PG)) { > if ((vcpu->arch.efer & EFER_LME)) { > int cs_db, cs_l; > > if (!is_pae(vcpu)) > return 1; > kvm_x86_ops->get_cs_db_l_bits(vcpu, &cs_db, &cs_l); > if (cs_l) > return 1; > > I think to remember this last check triggered. When we come from the > guest with paging off, we may run through this check an incorrectly bail > out here when the host state fulfills the conditions (PG, EFER_LME, and > L bit set). Just retried, and actually the first check (!is_pae) fails right now (with nested unrestricted guest mode patched in). The second one stumbles if I set CR4 before CR1 in load_vmcs12_host_state. So, however you put it, calling kvm_set_cr0 remains wrong. Jan
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 26d47e9..94f3b66 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -7429,7 +7429,7 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu, * fpu_active (which may have changed). * Note that vmx_set_cr0 refers to efer set above. */ - kvm_set_cr0(vcpu, vmcs12->host_cr0); + vmx_set_cr0(vcpu, vmcs12->host_cr0); /* * If we did fpu_activate()/fpu_deactivate() during L2's run, we need * to apply the same changes to L1's vmcs. We just set cr0 correctly,