diff mbox series

[35/43] KVM: x86: Move setting of sregs during vCPU RESET/INIT to common x86

Message ID 20210424004645.3950558-36-seanjc@google.com (mailing list archive)
State New, archived
Headers show
Series KVM: x86: vCPU RESET/INIT fixes and consolidation | expand

Commit Message

Sean Christopherson April 24, 2021, 12:46 a.m. UTC
Move the setting of CR0, CR4, EFER, RFLAGS, and RIP from vendor code to
common x86.  VMX and SVM now have near-identical sequences, the only
difference between that VMX updates the exception bitmap.  Updating the
bitmap on SVM is unnecessary, but benign.  Unfortunately it can't be left
behind in VMX due to the need to update exception intercepts after the
control registers are set.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/svm.c | 6 ------
 arch/x86/kvm/vmx/vmx.c | 9 ---------
 arch/x86/kvm/x86.c     | 8 ++++++++
 3 files changed, 8 insertions(+), 15 deletions(-)

Comments

Reiji Watanabe May 17, 2021, 11:50 p.m. UTC | #1
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -1204,12 +1204,6 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
>         init_sys_seg(&save->ldtr, SEG_TYPE_LDT);
>         init_sys_seg(&save->tr, SEG_TYPE_BUSY_TSS16);
>
> -       svm_set_cr0(vcpu, X86_CR0_NW | X86_CR0_CD | X86_CR0_ET);
> -       svm_set_cr4(vcpu, 0);
> -       svm_set_efer(vcpu, 0);
> -       kvm_set_rflags(vcpu, X86_EFLAGS_FIXED);
> -       vcpu->arch.regs[VCPU_REGS_RIP] = 0x0000fff0;

Reviewed-by: Reiji Watanabe <reijiw@google.com>

Those your vCPU RESET/INIT changes look great.

I think the change in init_vmcb() basically assumes that the
function is called from kvm_vcpu_reset(via svm_vcpu_reset()).
Although shutdown_interception() directly calls init_mcb(),
I would think the change doesn't matter for the shutdown
interception case.

IMHO it would be a bit misleading that a function named 'init_vmcb',
which is called from other than kvm_vcpu_reset (svm_vcpu_reset()),
only partially resets the vmcb (probably just to me though).
So, I personally think it would be better if its name or comment
can give some more specific information about the assumption.

BTW, it looks like two lines of "vcpu->arch.hflags = 0;"
can be also removed from the init_vmcb() as well.

Thanks,
Reiji
Sean Christopherson May 18, 2021, 9:45 p.m. UTC | #2
On Mon, May 17, 2021, Reiji Watanabe wrote:
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -1204,12 +1204,6 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
> >         init_sys_seg(&save->ldtr, SEG_TYPE_LDT);
> >         init_sys_seg(&save->tr, SEG_TYPE_BUSY_TSS16);
> >
> > -       svm_set_cr0(vcpu, X86_CR0_NW | X86_CR0_CD | X86_CR0_ET);
> > -       svm_set_cr4(vcpu, 0);
> > -       svm_set_efer(vcpu, 0);
> > -       kvm_set_rflags(vcpu, X86_EFLAGS_FIXED);
> > -       vcpu->arch.regs[VCPU_REGS_RIP] = 0x0000fff0;
> 
> Reviewed-by: Reiji Watanabe <reijiw@google.com>
> 
> Those your vCPU RESET/INIT changes look great.
> 
> I think the change in init_vmcb() basically assumes that the
> function is called from kvm_vcpu_reset(via svm_vcpu_reset()).
> Although shutdown_interception() directly calls init_mcb(),
> I would think the change doesn't matter for the shutdown
> interception case.
> 
> IMHO it would be a bit misleading that a function named 'init_vmcb',
> which is called from other than kvm_vcpu_reset (svm_vcpu_reset()),
> only partially resets the vmcb (probably just to me though).

It's not just you, that code is funky.  If I could go back in time, I would lobby
to not automatically init the VMCB and instead put the vCPU into
KVM_MP_STATE_UNINITIALIZED and force userspace to explicitly INIT or RESET the
vCPU.  Even better would be to add KVM_MP_STATE_SHUTDOWN, since technically NMI
can break shutdown (and SMI on Intel CPUs).

Anyways, that ship has sailed, but we might be able to get away with replacing
init_vmcb() with kvm_vcpu_reset(vcpu, true), i.e. effecting a full INIT.  That
would ensure the VMCB is consistent with KVM's software model, which I'm guessing
is not true with the direct init_vmcb() call.  It would also have some connection
to reality since there exist bare metal platforms that automatically INIT the CPU
if it hits shutdown (maybe only for the BSP?).

Side topic, the NMI thing got me looking through init_vmcb() to see how it
handles the IDT and GDT, and surprise, surprise, it fails to zero IDTR.base and
GDTR.base.  I'll add a patch to fix that, and maybe try to consolidate the VMX
and SVM segmentation logic.

> So, I personally think it would be better if its name or comment
> can give some more specific information about the assumption.
> 
> BTW, it looks like two lines of "vcpu->arch.hflags = 0;"
> can be also removed from the init_vmcb() as well.

Ya, I'll add that to the pile.

Thanks!
Reiji Watanabe May 21, 2021, 5:19 a.m. UTC | #3
On Tue, May 18, 2021 at 2:45 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Mon, May 17, 2021, Reiji Watanabe wrote:
> > > --- a/arch/x86/kvm/svm/svm.c
> > > +++ b/arch/x86/kvm/svm/svm.c
> > > @@ -1204,12 +1204,6 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
> > >         init_sys_seg(&save->ldtr, SEG_TYPE_LDT);
> > >         init_sys_seg(&save->tr, SEG_TYPE_BUSY_TSS16);
> > >
> > > -       svm_set_cr0(vcpu, X86_CR0_NW | X86_CR0_CD | X86_CR0_ET);
> > > -       svm_set_cr4(vcpu, 0);
> > > -       svm_set_efer(vcpu, 0);
> > > -       kvm_set_rflags(vcpu, X86_EFLAGS_FIXED);
> > > -       vcpu->arch.regs[VCPU_REGS_RIP] = 0x0000fff0;
> >
> > Reviewed-by: Reiji Watanabe <reijiw@google.com>
> >
> > Those your vCPU RESET/INIT changes look great.
> >
> > I think the change in init_vmcb() basically assumes that the
> > function is called from kvm_vcpu_reset(via svm_vcpu_reset()).
> > Although shutdown_interception() directly calls init_mcb(),
> > I would think the change doesn't matter for the shutdown
> > interception case.
> >
> > IMHO it would be a bit misleading that a function named 'init_vmcb',
> > which is called from other than kvm_vcpu_reset (svm_vcpu_reset()),
> > only partially resets the vmcb (probably just to me though).
>
> It's not just you, that code is funky.  If I could go back in time, I would lobby
> to not automatically init the VMCB and instead put the vCPU into
> KVM_MP_STATE_UNINITIALIZED and force userspace to explicitly INIT or RESET the
> vCPU.  Even better would be to add KVM_MP_STATE_SHUTDOWN, since technically NMI
> can break shutdown (and SMI on Intel CPUs).

I see.  Adding KVM_MP_STATE_SHUTDOWN sounds right to me
given the real CPU's behavior.

> Anyways, that ship has sailed, but we might be able to get away with replacing
> init_vmcb() with kvm_vcpu_reset(vcpu, true), i.e. effecting a full INIT.  That
> would ensure the VMCB is consistent with KVM's software model, which I'm guessing
> is not true with the direct init_vmcb() call.  It would also have some connection
> to reality since there exist bare metal platforms that automatically INIT the CPU
> if it hits shutdown (maybe only for the BSP?).

Yes, calling kvm_vcpu_reset(vcpu, true) sounds better than
the direct init_vmcb() call.


> Side topic, the NMI thing got me looking through init_vmcb() to see how it
> handles the IDT and GDT, and surprise, surprise, it fails to zero IDTR.base and
> GDTR.base.  I'll add a patch to fix that, and maybe try to consolidate the VMX
> and SVM segmentation logic.

That's surprising...
It seems init_vmcb() was used only for RESET when the function was
originally introduced and the entire vmcb was zero-cleared before
init_vmcb() was called.  So, I would suspect init_vmcb() was originally
implemented assuming that all the vmcb fields were zero.

 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6aa8b732ca01c3d7a54e93f4d701b8aabbe60fb7

Thanks,
Reiji
diff mbox series

Patch

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 996a6b03e338..23f880268ff5 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1204,12 +1204,6 @@  static void init_vmcb(struct kvm_vcpu *vcpu)
 	init_sys_seg(&save->ldtr, SEG_TYPE_LDT);
 	init_sys_seg(&save->tr, SEG_TYPE_BUSY_TSS16);
 
-	svm_set_cr0(vcpu, X86_CR0_NW | X86_CR0_CD | X86_CR0_ET);
-	svm_set_cr4(vcpu, 0);
-	svm_set_efer(vcpu, 0);
-	kvm_set_rflags(vcpu, X86_EFLAGS_FIXED);
-	vcpu->arch.regs[VCPU_REGS_RIP] = 0x0000fff0;
-
 	if (npt_enabled) {
 		/* Setup VMCB for Nested Paging */
 		control->nested_ctl |= SVM_NESTED_CTL_NP_ENABLE;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 8c982e049cbb..d8afca144e11 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4557,9 +4557,6 @@  static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 		vmcs_write64(GUEST_IA32_DEBUGCTL, 0);
 	}
 
-	kvm_set_rflags(vcpu, X86_EFLAGS_FIXED);
-	kvm_rip_write(vcpu, 0xfff0);
-
 	vmcs_writel(GUEST_GDTR_BASE, 0);
 	vmcs_write32(GUEST_GDTR_LIMIT, 0xffff);
 
@@ -4587,12 +4584,6 @@  static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 
 	kvm_make_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu);
 
-	vmx_set_cr0(vcpu, X86_CR0_NW | X86_CR0_CD | X86_CR0_ET);
-	vmx_set_cr4(vcpu, 0);
-	vmx_set_efer(vcpu, 0);
-
-	vmx_update_exception_bitmap(vcpu);
-
 	vpid_sync_context(vmx->vpid);
 	if (init_event)
 		vmx_clear_hlt(vcpu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 167c650d1187..97d8e3e74bab 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10499,6 +10499,14 @@  void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 
 	static_call(kvm_x86_vcpu_reset)(vcpu, init_event);
 
+	kvm_set_rflags(vcpu, X86_EFLAGS_FIXED);
+	kvm_rip_write(vcpu, 0xfff0);
+
+	static_call(kvm_x86_set_cr0)(vcpu, X86_CR0_NW | X86_CR0_CD | X86_CR0_ET);
+	static_call(kvm_x86_set_cr4)(vcpu, 0);
+	static_call(kvm_x86_set_efer)(vcpu, 0);
+	static_call(kvm_x86_update_exception_bitmap)(vcpu);
+
 	if (kvm_cr0_mmu_role_changed(old_cr0, kvm_read_cr0(vcpu)) ||
 	    kvm_cr4_mmu_role_changed(old_cr4, kvm_read_cr4(vcpu)))
 		kvm_mmu_reset_context(vcpu);