
KVM: x86: Don't attempt to load PDPTRs when 64-bit mode is enabled

Message ID 20200714015732.32426-1-sean.j.christopherson@intel.com (mailing list archive)
State New, archived
Series KVM: x86: Don't attempt to load PDPTRs when 64-bit mode is enabled

Commit Message

Sean Christopherson July 14, 2020, 1:57 a.m. UTC
Don't attempt to load PDPTRs if EFER.LME=1, i.e. if 64-bit mode is
enabled.  A recent change to reload the PDPTRs when CR0.CD or CR0.NW is
toggled botched the EFER.LME handling and sends KVM down the PDPTR path
when is_paging() is true, i.e. when the guest toggles CD/NW in 64-bit
mode.

Split the CR0 checks for 64-bit vs. 32-bit PAE into separate paths.  The
64-bit path is specifically checking state when paging is toggled on,
i.e. CR0.PG transitions from 0->1.  The PDPTR path now needs to run if
the new CR0 state has paging enabled, irrespective of whether paging was
already enabled.  Trying to shave a few cycles to make the PDPTR path an
"else if" case is a mess.

Fixes: d42e3fae6faed ("kvm: x86: Read PDPTEs on CR0.CD and CR0.NW changes")
Cc: Jim Mattson <jmattson@google.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Peter Shier <pshier@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---

The other way to fix this, with a much smaller diff stat, is to simply
move the !is_paging(vcpu) check inside (vcpu->arch.efer & EFER_LME).  But
that results in a ridiculous amount of nested conditionals for what is a
very straightforward check e.g.

	if (cr0 & X86_CR0_PG) {
		if (vcpu->arch.efer & EFER_LME) {
			if (!is_paging(vcpu)) {
				...
			}
		}
	}

Since this doesn't need to be backported anywhere, I didn't see any value
in having an intermediate step.

 arch/x86/kvm/x86.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

Comments

Vitaly Kuznetsov July 14, 2020, noon UTC | #1
Sean Christopherson <sean.j.christopherson@intel.com> writes:

> Don't attempt to load PDPTRs if EFER.LME=1, i.e. if 64-bit mode is
> enabled.  A recent change to reload the PDPTRs when CR0.CD or CR0.NW is
> toggled botched the EFER.LME handling and sends KVM down the PDPTR path
> when is_paging() is true, i.e. when the guest toggles CD/NW in 64-bit
> mode.
>
> Split the CR0 checks for 64-bit vs. 32-bit PAE into separate paths.  The
> 64-bit path is specifically checking state when paging is toggled on,
> i.e. CR0.PG transitions from 0->1.  The PDPTR path now needs to run if
> the new CR0 state has paging enabled, irrespective of whether paging was
> already enabled.  Trying to shave a few cycles to make the PDPTR path an
> "else if" case is a mess.
>
> Fixes: d42e3fae6faed ("kvm: x86: Read PDPTEs on CR0.CD and CR0.NW changes")
> Cc: Jim Mattson <jmattson@google.com>
> Cc: Oliver Upton <oupton@google.com>
> Cc: Peter Shier <pshier@google.com>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>
> The other way to fix this, with a much smaller diff stat, is to simply
> move the !is_paging(vcpu) check inside (vcpu->arch.efer & EFER_LME).  But
> that results in a ridiculous amount of nested conditionals for what is a
> very straightforward check e.g.
>
> 	if (cr0 & X86_CR0_PG) {
> 		if (vcpu->arch.efer & EFER_LME) {
> 			if (!is_paging(vcpu)) {
> 				...
> 			}
> 		}
> 	}
>
> Since this doesn't need to be backported anywhere, I didn't see any value
> in having an intermediate step.
>
>  arch/x86/kvm/x86.c | 24 ++++++++++++------------
>  1 file changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 95ef629228691..5f526d94c33f3 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -819,22 +819,22 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
>  	if ((cr0 & X86_CR0_PG) && !(cr0 & X86_CR0_PE))
>  		return 1;
>  
> -	if (cr0 & X86_CR0_PG) {
>  #ifdef CONFIG_X86_64
> -		if (!is_paging(vcpu) && (vcpu->arch.efer & EFER_LME)) {
> -			int cs_db, cs_l;
> +	if ((vcpu->arch.efer & EFER_LME) && !is_paging(vcpu) &&
> +	    (cr0 & X86_CR0_PG)) {

it seems we have more than one occurrence of "if (vcpu->arch.efer &
EFER_LME)" under "#ifdef CONFIG_X86_64" and we already have

static inline int is_long_mode(struct kvm_vcpu *vcpu)
{
#ifdef CONFIG_X86_64
     return vcpu->arch.efer & EFER_LMA;
#else
     return 0;
#endif
}

so if we use this instead, the compiler will just throw away the
unreachable blocks when CONFIG_X86_64 is not defined, right?

> +		int cs_db, cs_l;
>  
> -			if (!is_pae(vcpu))
> -				return 1;
> -			kvm_x86_ops.get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
> -			if (cs_l)
> -				return 1;
> -		} else
> -#endif
> -		if (is_pae(vcpu) && ((cr0 ^ old_cr0) & pdptr_bits) &&
> -		    !load_pdptrs(vcpu, vcpu->arch.walk_mmu, kvm_read_cr3(vcpu)))
> +		if (!is_pae(vcpu))
> +			return 1;
> +		kvm_x86_ops.get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
> +		if (cs_l)
>  			return 1;
>  	}
> +#endif
> +	if (!(vcpu->arch.efer & EFER_LME) && (cr0 & X86_CR0_PG) &&
> +	    is_pae(vcpu) && ((cr0 ^ old_cr0) & pdptr_bits) &&
> +	    !load_pdptrs(vcpu, vcpu->arch.walk_mmu, kvm_read_cr3(vcpu)))
> +		return 1;
>  
>  	if (!(cr0 & X86_CR0_PG) && kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE))
>  		return 1;
Sean Christopherson July 14, 2020, 1:21 p.m. UTC | #2
On Tue, Jul 14, 2020 at 02:00:04PM +0200, Vitaly Kuznetsov wrote:
> Sean Christopherson <sean.j.christopherson@intel.com> writes:
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 95ef629228691..5f526d94c33f3 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -819,22 +819,22 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
> >  	if ((cr0 & X86_CR0_PG) && !(cr0 & X86_CR0_PE))
> >  		return 1;
> >  
> > -	if (cr0 & X86_CR0_PG) {
> >  #ifdef CONFIG_X86_64
> > -		if (!is_paging(vcpu) && (vcpu->arch.efer & EFER_LME)) {
> > -			int cs_db, cs_l;
> > +	if ((vcpu->arch.efer & EFER_LME) && !is_paging(vcpu) &&
> > +	    (cr0 & X86_CR0_PG)) {
> 
> it seems we have more than one occurrence of "if (vcpu->arch.efer &
> EFER_LME)" under "#ifdef CONFIG_X86_64" and we already have
> 
> static inline int is_long_mode(struct kvm_vcpu *vcpu)
> {
> #ifdef CONFIG_X86_64
>      return vcpu->arch.efer & EFER_LMA;
> #else
>      return 0;
> #endif
> }
> 
> so if we use this instead, the compiler will just throw away the
> unreachable blocks when CONFIG_X86_64 is not defined, right?

EFER.LME vs. EFER.LMA.  The kvm_set_cr0() check is specifically looking at
the case where EFER.LME=1, EFER.LMA=0, and CR0.PG is being toggled on, i.e.
long mode is being enabled.  EFER_LMA won't be set until vmx_set_cr0() does
enter_lmode().
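
Spelled out purely for illustration, the state of interest here is roughly
(the EFER.LMA term is implicit in the actual check, since LMA can't be set
while CR0.PG=0):

	/*
	 * Long mode is being *enabled*: EFER.LME=1, EFER.LMA still 0, and
	 * CR0.PG is transitioning 0->1.
	 */
	if ((vcpu->arch.efer & EFER_LME) && !(vcpu->arch.efer & EFER_LMA) &&
	    !is_paging(vcpu) && (cr0 & X86_CR0_PG)) {
		/* long mode consistency checks: PAE must be set, CS.L clear */
	}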
Vitaly Kuznetsov July 14, 2020, 2:11 p.m. UTC | #3
Sean Christopherson <sean.j.christopherson@intel.com> writes:

> On Tue, Jul 14, 2020 at 02:00:04PM +0200, Vitaly Kuznetsov wrote:
>> Sean Christopherson <sean.j.christopherson@intel.com> writes:
>> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> > index 95ef629228691..5f526d94c33f3 100644
>> > --- a/arch/x86/kvm/x86.c
>> > +++ b/arch/x86/kvm/x86.c
>> > @@ -819,22 +819,22 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
>> >  	if ((cr0 & X86_CR0_PG) && !(cr0 & X86_CR0_PE))
>> >  		return 1;
>> >  
>> > -	if (cr0 & X86_CR0_PG) {
>> >  #ifdef CONFIG_X86_64
>> > -		if (!is_paging(vcpu) && (vcpu->arch.efer & EFER_LME)) {
>> > -			int cs_db, cs_l;
>> > +	if ((vcpu->arch.efer & EFER_LME) && !is_paging(vcpu) &&
>> > +	    (cr0 & X86_CR0_PG)) {
>> 
>> it seems we have more than one occurrence of "if (vcpu->arch.efer &
>> EFER_LME)" under "#ifdef CONFIG_X86_64" and we already have
>> 
>> static inline int is_long_mode(struct kvm_vcpu *vcpu)
>> {
>> #ifdef CONFIG_X86_64
>>      return vcpu->arch.efer & EFER_LMA;
>> #else
>>      return 0;
>> #endif
>> }
>> 
>> so if we use this instead, the compiler will just throw away the
>> unreachable blocks when CONFIG_X86_64 is not defined, right?
>
> EFER.LME vs. EFER.LMA.  The kvm_set_cr0() check is specifically looking at
> the case where EFER.LME=1, EFER.LMA=0, and CR0.PG is being toggled on, i.e.
> long mode is being enabled.  EFER_LMA won't be set until vmx_set_cr0() does
> enter_lmode().

Oops, shame on me :-(

Would it make sense to introduce something like is_long_mode()
(e.g. is_efer_lme()) for LME with the #ifdef CONFIG_X86_64 handling? I see
the same checks in vmx_set_cr0()/svm_set_cr0().
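
Something along these lines, mirroring is_long_mode() (just a rough sketch,
name and placement up for debate):

static inline int is_efer_lme(struct kvm_vcpu *vcpu)
{
#ifdef CONFIG_X86_64
	/* EFER.LME: long mode enabled (requested), not necessarily active. */
	return vcpu->arch.efer & EFER_LME;
#else
	return 0;
#endif
}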
Jim Mattson July 14, 2020, 6:55 p.m. UTC | #4
On Mon, Jul 13, 2020 at 6:57 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> Don't attempt to load PDPTRs if EFER.LME=1, i.e. if 64-bit mode is
> enabled.  A recent change to reload the PDPTRs when CR0.CD or CR0.NW is
> toggled botched the EFER.LME handling and sends KVM down the PDPTR path
> when is_paging() is true, i.e. when the guest toggles CD/NW in 64-bit
> mode.

Oops!

I don't think "is_paging()" is relevant here, so much as "EFER.LME=1."
As you note below, KVM *should* go down the PDPTR path when
is_paging() is true and EFER.LME=0.

> Split the CR0 checks for 64-bit vs. 32-bit PAE into separate paths.  The
> 64-bit path is specifically checking state when paging is toggled on,
> i.e. CR0.PG transitions from 0->1.  The PDPTR path now needs to run if
> the new CR0 state has paging enabled, irrespective of whether paging was
> already enabled.  Trying to shave a few cycles to make the PDPTR path an
> "else if" case is a mess.
>
> Fixes: d42e3fae6faed ("kvm: x86: Read PDPTEs on CR0.CD and CR0.NW changes")
> Cc: Jim Mattson <jmattson@google.com>
> Cc: Oliver Upton <oupton@google.com>
> Cc: Peter Shier <pshier@google.com>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Sean Christopherson July 14, 2020, 6:58 p.m. UTC | #5
On Tue, Jul 14, 2020 at 11:55:45AM -0700, Jim Mattson wrote:
> On Mon, Jul 13, 2020 at 6:57 PM Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
> >
> > Don't attempt to load PDPTRs if EFER.LME=1, i.e. if 64-bit mode is
> > enabled.  A recent change to reload the PDPTRs when CR0.CD or CR0.NW is
> > toggled botched the EFER.LME handling and sends KVM down the PDPTR path
> > when is_paging() is true, i.e. when the guest toggles CD/NW in 64-bit
> > mode.
> 
> Oops!
> 
> I don't think "is_paging()" is relevant here, so much as "EFER.LME=1."
> As you note below, KVM *should* go down the PDPTR path when
> is_paging() is true and EFER.LME=0.

It's relevant for the EFER.LME=1 case as it's used to detect CR0.PG 0->1.

Though maybe we're in violent agreement?
Jim Mattson July 14, 2020, 7:02 p.m. UTC | #6
On Tue, Jul 14, 2020 at 11:59 AM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Tue, Jul 14, 2020 at 11:55:45AM -0700, Jim Mattson wrote:
> > On Mon, Jul 13, 2020 at 6:57 PM Sean Christopherson
> > <sean.j.christopherson@intel.com> wrote:
> > >
> > > Don't attempt to load PDPTRs if EFER.LME=1, i.e. if 64-bit mode is
> > > enabled.  A recent change to reload the PDPTRs when CR0.CD or CR0.NW is
> > > toggled botched the EFER.LME handling and sends KVM down the PDPTR path
> > > when is_paging() is true, i.e. when the guest toggles CD/NW in 64-bit
> > > mode.
> >
> > Oops!
> >
> > I don't think "is_paging()" is relevant here, so much as "EFER.LME=1."
> > As you note below, KVM *should* go down the PDPTR path when
> > is_paging() is true and EFER.LME=0.
>
> It's relevant for the EFER.LME=1 case as it's used to detect CR0.PG 0->1.
>
> Though maybe we're in violent agreement?

We're in agreement conceptually, but I find your original text lacking
in clarity. :-)
Sean Christopherson Aug. 4, 2020, 6:41 p.m. UTC | #7
On Mon, Jul 13, 2020 at 06:57:32PM -0700, Sean Christopherson wrote:
> Don't attempt to load PDPTRs if EFER.LME=1, i.e. if 64-bit mode is
> enabled.  A recent change to reload the PDPTRs when CR0.CD or CR0.NW is
> toggled botched the EFER.LME handling and sends KVM down the PDPTR path
> when is_paging() is true, i.e. when the guest toggles CD/NW in 64-bit
> mode.
> 
> Split the CR0 checks for 64-bit vs. 32-bit PAE into separate paths.  The
> 64-bit path is specifically checking state when paging is toggled on,
> i.e. CR0.PG transitions from 0->1.  The PDPTR path now needs to run if
> the new CR0 state has paging enabled, irrespective of whether paging was
> already enabled.  Trying to shave a few cycles to make the PDPTR path an
> "else if" case is a mess.
> 
> Fixes: d42e3fae6faed ("kvm: x86: Read PDPTEs on CR0.CD and CR0.NW changes")
> Cc: Jim Mattson <jmattson@google.com>
> Cc: Oliver Upton <oupton@google.com>
> Cc: Peter Shier <pshier@google.com>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---

Ping.  This really needs to be in the initial pull for 5.9, as kvm/queue
as-is has a 100% fatality rate for me.
Jim Mattson Aug. 4, 2020, 6:46 p.m. UTC | #8
On Tue, Aug 4, 2020 at 11:41 AM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:

> Ping.  This really needs to be in the initial pull for 5.9, as kvm/queue
> as-is has a 100% fatality rate for me.

I agree completely, but I am curious what guest you have that toggles
CD/NW in 64-bit mode.
Sean Christopherson Aug. 4, 2020, 7:09 p.m. UTC | #9
On Tue, Aug 04, 2020 at 11:46:58AM -0700, Jim Mattson wrote:
> On Tue, Aug 4, 2020 at 11:41 AM Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
> 
> > Ping.  This really needs to be in the initial pull for 5.9, as is kvm/queue
> > has a 100% fatality rate for me.
> 
> I agree completely, but I am curious what guest you have that toggles
> CD/NW in 64-bit mode.

It's my OVMF build.  I assume it's MTRR programming, but I haven't bothered
to debug that far.  Nor do I want to; the EDKII build framework makes my
head explode :-).
Maxim Levitsky Aug. 5, 2020, 7:04 a.m. UTC | #10
On Mon, 2020-07-13 at 18:57 -0700, Sean Christopherson wrote:
> Don't attempt to load PDPTRs if EFER.LME=1, i.e. if 64-bit mode is
> enabled.  A recent change to reload the PDPTRs when CR0.CD or CR0.NW is
> toggled botched the EFER.LME handling and sends KVM down the PDPTR path
> when is_paging() is true, i.e. when the guest toggles CD/NW in 64-bit
> mode.
> 
> Split the CR0 checks for 64-bit vs. 32-bit PAE into separate paths.  The
> 64-bit path is specifically checking state when paging is toggled on,
> i.e. CR0.PG transitions from 0->1.  The PDPTR path now needs to run if
> the new CR0 state has paging enabled, irrespective of whether paging was
> already enabled.  Trying to shave a few cycles to make the PDPTR path an
> "else if" case is a mess.
> 
> Fixes: d42e3fae6faed ("kvm: x86: Read PDPTEs on CR0.CD and CR0.NW changes")
> Cc: Jim Mattson <jmattson@google.com>
> Cc: Oliver Upton <oupton@google.com>
> Cc: Peter Shier <pshier@google.com>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
> 
> The other way to fix this, with a much smaller diff stat, is to simply
> move the !is_paging(vcpu) check inside (vcpu->arch.efer & EFER_LME).  But
> that results in a ridiculous amount of nested conditionals for what is a
> very straightforward check e.g.
> 
> 	if (cr0 & X86_CR0_PG) {
> 		if (vcpu->arch.efer & EFER_LME) {
> 			if (!is_paging(vcpu)) {
> 				...
> 			}
> 		}
> 	}
> 
> Since this doesn't need to be backported anywhere, I didn't see any value
> in having an intermediate step.
> 
>  arch/x86/kvm/x86.c | 24 ++++++++++++------------
>  1 file changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 95ef629228691..5f526d94c33f3 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -819,22 +819,22 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
>  	if ((cr0 & X86_CR0_PG) && !(cr0 & X86_CR0_PE))
>  		return 1;
>  
> -	if (cr0 & X86_CR0_PG) {
>  #ifdef CONFIG_X86_64
> -		if (!is_paging(vcpu) && (vcpu->arch.efer & EFER_LME)) {
> -			int cs_db, cs_l;
> +	if ((vcpu->arch.efer & EFER_LME) && !is_paging(vcpu) &&
> +	    (cr0 & X86_CR0_PG)) {
> +		int cs_db, cs_l;
>  
> -			if (!is_pae(vcpu))
> -				return 1;
> -			kvm_x86_ops.get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
> -			if (cs_l)
> -				return 1;
> -		} else
> -#endif
> -		if (is_pae(vcpu) && ((cr0 ^ old_cr0) & pdptr_bits) &&
> -		    !load_pdptrs(vcpu, vcpu->arch.walk_mmu, kvm_read_cr3(vcpu)))
> +		if (!is_pae(vcpu))
> +			return 1;
> +		kvm_x86_ops.get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
> +		if (cs_l)
>  			return 1;
>  	}
> +#endif
> +	if (!(vcpu->arch.efer & EFER_LME) && (cr0 & X86_CR0_PG) &&
> +	    is_pae(vcpu) && ((cr0 ^ old_cr0) & pdptr_bits) &&
> +	    !load_pdptrs(vcpu, vcpu->arch.walk_mmu, kvm_read_cr3(vcpu)))
> +		return 1;
>  
>  	if (!(cr0 & X86_CR0_PG) && kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE))
>  		return 1;

I also investigated this issue (same symptom, OVMF doesn't boot),
and after looking at the Intel and AMD PRMs, this looks like the correct solution.
I also tested this and it works.


Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky
Jim Mattson Aug. 6, 2020, 9:32 p.m. UTC | #11
On Wed, Aug 5, 2020 at 12:04 AM Maxim Levitsky <mlevitsk@redhat.com> wrote:
>
> On Mon, 2020-07-13 at 18:57 -0700, Sean Christopherson wrote:
> > Don't attempt to load PDPTRs if EFER.LME=1, i.e. if 64-bit mode is
> > enabled.  A recent change to reload the PDPTRs when CR0.CD or CR0.NW is
> > toggled botched the EFER.LME handling and sends KVM down the PDPTR path
> > when is_paging() is true, i.e. when the guest toggles CD/NW in 64-bit
> > mode.
> >
> > Split the CR0 checks for 64-bit vs. 32-bit PAE into separate paths.  The
> > 64-bit path is specifically checking state when paging is toggled on,
> > i.e. CR0.PG transitions from 0->1.  The PDPTR path now needs to run if
> > the new CR0 state has paging enabled, irrespective of whether paging was
> > already enabled.  Trying to shave a few cycles to make the PDPTR path an
> > "else if" case is a mess.
> >
> > Fixes: d42e3fae6faed ("kvm: x86: Read PDPTEs on CR0.CD and CR0.NW changes")
> > Cc: Jim Mattson <jmattson@google.com>
> > Cc: Oliver Upton <oupton@google.com>
> > Cc: Peter Shier <pshier@google.com>
> > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> > ---
> >
> > The other way to fix this, with a much smaller diff stat, is to simply
> > move the !is_paging(vcpu) check inside (vcpu->arch.efer & EFER_LME).  But
> > that results in a ridiculous amount of nested conditionals for what is a
> > very straightforward check e.g.
> >
> >       if (cr0 & X86_CR0_PG) {
> >               if (vcpu->arch.efer & EFER_LME) {
> >                       if (!is_paging(vcpu)) {
> >                               ...
> >                       }
> >               }
> >       }
> >
> > Since this doesn't need to be backported anywhere, I didn't see any value
> > in having an intermediate step.
> >
> >  arch/x86/kvm/x86.c | 24 ++++++++++++------------
> >  1 file changed, 12 insertions(+), 12 deletions(-)
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 95ef629228691..5f526d94c33f3 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -819,22 +819,22 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
> >       if ((cr0 & X86_CR0_PG) && !(cr0 & X86_CR0_PE))
> >               return 1;
> >
> > -     if (cr0 & X86_CR0_PG) {
> >  #ifdef CONFIG_X86_64
> > -             if (!is_paging(vcpu) && (vcpu->arch.efer & EFER_LME)) {
> > -                     int cs_db, cs_l;
> > +     if ((vcpu->arch.efer & EFER_LME) && !is_paging(vcpu) &&
> > +         (cr0 & X86_CR0_PG)) {
> > +             int cs_db, cs_l;
> >
> > -                     if (!is_pae(vcpu))
> > -                             return 1;
> > -                     kvm_x86_ops.get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
> > -                     if (cs_l)
> > -                             return 1;
> > -             } else
> > -#endif
> > -             if (is_pae(vcpu) && ((cr0 ^ old_cr0) & pdptr_bits) &&
> > -                 !load_pdptrs(vcpu, vcpu->arch.walk_mmu, kvm_read_cr3(vcpu)))
> > +             if (!is_pae(vcpu))
> > +                     return 1;
> > +             kvm_x86_ops.get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
> > +             if (cs_l)
> >                       return 1;
> >       }
> > +#endif
> > +     if (!(vcpu->arch.efer & EFER_LME) && (cr0 & X86_CR0_PG) &&
> > +         is_pae(vcpu) && ((cr0 ^ old_cr0) & pdptr_bits) &&
> > +         !load_pdptrs(vcpu, vcpu->arch.walk_mmu, kvm_read_cr3(vcpu)))
> > +             return 1;

It might be worth commenting on the subtlety of the test below being
skipped if the PDPTEs were loaded above. I'm assuming that the PDPTEs
shouldn't be loaded if the instruction faults.
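
E.g., something along the lines of (rough wording only):

	/*
	 * The PDPTEs are loaded above only when the new CR0 has PG=1,
	 * while the PCIDE check below fires only when the new CR0 has
	 * PG=0, so a failure below can't occur after the PDPTEs have
	 * already been loaded.
	 */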

> >       if (!(cr0 & X86_CR0_PG) && kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE))
> >               return 1;
>
> I also investigated this issue (same symptom, OVMF doesn't boot),
> and after looking at the Intel and AMD PRMs, this looks like the correct solution.
> I also tested this and it works.
>
>
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
>
> Best regards,
>         Maxim Levitsky
>

Patch

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 95ef629228691..5f526d94c33f3 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -819,22 +819,22 @@  int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 	if ((cr0 & X86_CR0_PG) && !(cr0 & X86_CR0_PE))
 		return 1;
 
-	if (cr0 & X86_CR0_PG) {
 #ifdef CONFIG_X86_64
-		if (!is_paging(vcpu) && (vcpu->arch.efer & EFER_LME)) {
-			int cs_db, cs_l;
+	if ((vcpu->arch.efer & EFER_LME) && !is_paging(vcpu) &&
+	    (cr0 & X86_CR0_PG)) {
+		int cs_db, cs_l;
 
-			if (!is_pae(vcpu))
-				return 1;
-			kvm_x86_ops.get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
-			if (cs_l)
-				return 1;
-		} else
-#endif
-		if (is_pae(vcpu) && ((cr0 ^ old_cr0) & pdptr_bits) &&
-		    !load_pdptrs(vcpu, vcpu->arch.walk_mmu, kvm_read_cr3(vcpu)))
+		if (!is_pae(vcpu))
+			return 1;
+		kvm_x86_ops.get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
+		if (cs_l)
 			return 1;
 	}
+#endif
+	if (!(vcpu->arch.efer & EFER_LME) && (cr0 & X86_CR0_PG) &&
+	    is_pae(vcpu) && ((cr0 ^ old_cr0) & pdptr_bits) &&
+	    !load_pdptrs(vcpu, vcpu->arch.walk_mmu, kvm_read_cr3(vcpu)))
+		return 1;
 
 	if (!(cr0 & X86_CR0_PG) && kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE))
 		return 1;