[01/17] x86: spec_ctrl: fix SPEC_CTRL initialization after kexec

Message ID: 20190920212509.2578-2-aarcange@redhat.com
Series: KVM monolithic v1

Commit Message

Andrea Arcangeli Sept. 20, 2019, 9:24 p.m. UTC
We can't assume the SPEC_CTRL msr is zero at boot because it could be
left enabled by a previous kernel booted with
spec_store_bypass_disable=on.

Without this fix a boot with spec_store_bypass_disable=on followed by
a kexec boot with spec_store_bypass_disable=off would erroneously and
unexpectedly leave bit 2 set in SPEC_CTRL.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
 arch/x86/include/asm/msr-index.h |  2 ++
 arch/x86/kernel/cpu/bugs.c       | 20 +++++++++++++++++++-
 2 files changed, 21 insertions(+), 1 deletion(-)
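
To observe the symptom described in the commit message, here is a minimal
userspace sketch (not part of the patch; it assumes root privileges and the
msr driver, i.e. "modprobe msr") that reads IA32_SPEC_CTRL (MSR 0x48) on
CPU 0:

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define MSR_IA32_SPEC_CTRL	0x48
#define SPEC_CTRL_SSBD		(1ULL << 2)

int main(void)
{
	uint64_t val;
	int fd = open("/dev/cpu/0/msr", O_RDONLY);

	if (fd < 0) {
		perror("open /dev/cpu/0/msr");
		return 1;
	}
	/* the msr driver uses the file offset as the MSR number */
	if (pread(fd, &val, sizeof(val), MSR_IA32_SPEC_CTRL) != sizeof(val)) {
		perror("pread");
		close(fd);
		return 1;
	}
	printf("SPEC_CTRL = %#llx, SSBD is %s\n", (unsigned long long)val,
	       (val & SPEC_CTRL_SSBD) ? "set" : "clear");
	close(fd);
	return 0;
}

Booting with spec_store_bypass_disable=on and then kexec'ing into a kernel
booted with spec_store_bypass_disable=off would show SSBD still set without
this fix ("rdmsr 0x48" from msr-tools gives the same readout).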

Comments

Paolo Bonzini Sept. 23, 2019, 10:22 a.m. UTC | #1
On 20/09/19 23:24, Andrea Arcangeli wrote:
> We can't assume the SPEC_CTRL msr is zero at boot because it could be
> left enabled by a previous kernel booted with
> spec_store_bypass_disable=on.
> 
> Without this fix a boot with spec_store_bypass_disable=on followed by
> a kexec boot with spec_store_bypass_disable=off would erroneously and
> unexpectedly leave bit 2 set in SPEC_CTRL.
> 
> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>

Can you send this out separately, so that Thomas et al. can pick it up
as a bug fix?

Thanks,

Paolo
Sean Christopherson Sept. 23, 2019, 3:30 p.m. UTC | #2
On Mon, Sep 23, 2019 at 12:22:23PM +0200, Paolo Bonzini wrote:
> On 20/09/19 23:24, Andrea Arcangeli wrote:
> > We can't assume the SPEC_CTRL msr is zero at boot because it could be
> > left enabled by a previous kernel booted with
> > spec_store_bypass_disable=on.
> > 
> > Without this fix a boot with spec_store_bypass_disable=on followed by
> > a kexec boot with spec_store_bypass_disable=off would erroneously and
> > unexpectedly leave bit 2 set in SPEC_CTRL.
> > 
> > Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
> 
> Can you send this out separately, so that Thomas et al. can pick it up
> as a bug fix?

Can all of the patches that are not directly related to the monolithic
conversion be sent separately?  AFAICT, patches 01, 03, 07, 08, 14, 15, 16
and 17 are neither required by nor dependent on the conversion to a
monolithic module.  That's almost half the series...
Andrea Arcangeli Sept. 23, 2019, 5:34 p.m. UTC | #3
Hello,

On Mon, Sep 23, 2019 at 08:30:57AM -0700, Sean Christopherson wrote:
> On Mon, Sep 23, 2019 at 12:22:23PM +0200, Paolo Bonzini wrote:
> > On 20/09/19 23:24, Andrea Arcangeli wrote:
> > > We can't assume the SPEC_CTRL msr is zero at boot because it could be
> > > left enabled by a previous kernel booted with
> > > spec_store_bypass_disable=on.
> > > 
> > > Without this fix a boot with spec_store_bypass_disable=on followed by
> > > a kexec boot with spec_store_bypass_disable=off would erroneously and
> > > unexpectedly leave bit 2 set in SPEC_CTRL.
> > > 
> > > Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
> > 
> > Can you send this out separately, so that Thomas et al. can pick it up
> > as a bug fix?

As specified in the cover letter, 1/17 was already intended to be
merged separately. I kept it included in case people had the idea of
using kexec to benchmark this work, because I was bitten by that bug
myself and it wasted a few days' worth of benchmarks.

> Can all of the patches that are not directly related to the monolithic
> conversion be sent separately?  AFAICT, patches 01, 03, 07, 08, 14, 15, 16
> and 17 are neither required by nor dependent on the conversion to a
> monolithic module.  That's almost half the series...

03, 07 and 08 are directly related to the monolithic conversion, as
their subject lines clarify. In fact I should try to reorder 07/08 to
the front to make things more bisectable under all config options.

Per its subject line, 14 is an optimization that, while not a strict
requirement, is somewhat related to the monolithic conversion: it may
naturally disappear if I rename the vmx/svm functions directly.

15, 16 and 17 don't have the monolithic tag in their subject lines and
they're obviously unrelated to the monolithic conversion, but when I
did the first research on this idea of dropping kvm.ko a couple of
months ago, things didn't really work well until I got rid of those
few last retpolines too. It felt as if the large retpoline regression
wasn't linear with the number of retpolines executed for each vmexit,
but rather with the percentage of vmexits that hit any number of
retpolines. So while they're not part of the monolithic conversion, I
assumed they're required to run any meaningful benchmark.
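
To make that concrete: the modular code reaches vmx/svm through function
pointers, and with CONFIG_RETPOLINE every such indirect call goes through a
retpoline thunk. A simplified sketch (hypothetical names, not actual code
from this series) of the difference:

/* simplified sketch, hypothetical names, not actual code from the series */

struct kvm_x86_ops_sketch {
	int (*run)(void *vcpu);		/* vmx_vcpu_run or svm_vcpu_run */
};

extern struct kvm_x86_ops_sketch *ops;
extern int vmx_vcpu_run(void *vcpu);

int enter_guest_modular(void *vcpu)
{
	/* indirect call: retpolined with CONFIG_RETPOLINE, paid on
	 * every vmentry/vmexit cycle */
	return ops->run(vcpu);
}

int enter_guest_monolithic(void *vcpu)
{
	/* direct call: resolved at link time once vmx/svm is built
	 * into the same image, no retpoline */
	return vmx_vcpu_run(vcpu);
}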

I can of course drop 15, 16 and 17 from further submissions, after
clarifying that benchmarks should only be run on the full v1 set I
posted earlier, or they wouldn't be meaningful.

Thanks,
Andrea
Sean Christopherson Sept. 23, 2019, 10:27 p.m. UTC | #4
On Mon, Sep 23, 2019 at 01:34:21PM -0400, Andrea Arcangeli wrote:
> Per its subject line, 14 is an optimization that, while not a strict
> requirement, is somewhat related to the monolithic conversion: it may
> naturally disappear if I rename the vmx/svm functions directly.
> 
> 15, 16 and 17 don't have the monolithic tag in their subject lines and
> they're obviously unrelated to the monolithic conversion, but when I
> did the first research on this idea of dropping kvm.ko a couple of
> months ago, things didn't really work well until I got rid of those
> few last retpolines too. It felt as if the large retpoline regression
> wasn't linear with the number of retpolines executed for each vmexit,
> but rather with the percentage of vmexits that hit any number of
> retpolines. So while they're not part of the monolithic conversion, I
> assumed they're required to run any meaningful benchmark.
> 
> I can of course drop 15, 16 and 17 from further submissions, after
> clarifying that benchmarks should only be run on the full v1 set I
> posted earlier, or they wouldn't be meaningful.

I like the patches; I'd just prefer that they be sent in a separate
series so they can churn independently.

Patch

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 20ce682a2540..3ba95728a6fe 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -47,6 +47,8 @@ 
 #define SPEC_CTRL_STIBP			BIT(SPEC_CTRL_STIBP_SHIFT)	/* STIBP mask */
 #define SPEC_CTRL_SSBD_SHIFT		2	   /* Speculative Store Bypass Disable bit */
 #define SPEC_CTRL_SSBD			BIT(SPEC_CTRL_SSBD_SHIFT)	/* Speculative Store Bypass Disable */
+#define SPEC_CTRL_ALL			(SPEC_CTRL_IBRS|SPEC_CTRL_STIBP| \
+					 SPEC_CTRL_SSBD) /* all known bits */
 
 #define MSR_IA32_PRED_CMD		0x00000049 /* Prediction Command */
 #define PRED_CMD_IBPB			BIT(0)	   /* Indirect Branch Prediction Barrier */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 91c2561b905f..e3922dcf252f 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -92,8 +92,26 @@  void __init check_bugs(void)
 	 * have unknown values. AMD64_LS_CFG MSR is cached in the early AMD
 	 * init code as it is not enumerated and depends on the family.
 	 */
-	if (boot_cpu_has(X86_FEATURE_MSR_SPEC_CTRL))
+	if (boot_cpu_has(X86_FEATURE_MSR_SPEC_CTRL)) {
 		rdmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
+		/*
+		 * Clear the non reserved bits from x86_spec_ctrl_base
+		 * to fix kexec. Otherwise for example SSBD could be
+		 * left enabled despite booting with
+		 * spec_store_bypass_disable=off because SSBD would be
+		 * erroneously mistaken as a reserved bit set by the
+		 * BIOS when in fact it was set by a previous kernel
+		 * booted with spec_store_bypass_disable=on. Careful
+		 * however not to write SPEC_CTRL unnecessarily to
+		 * keep the virt MSR intercept enabled as long as
+		 * possible.
+		 */
+		if (x86_spec_ctrl_base & SPEC_CTRL_ALL) {
+			/* all known bits must not be set at boot, clear it */
+			x86_spec_ctrl_base &= ~SPEC_CTRL_ALL;
+			wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
+		}
+	}
 
 	/* Allow STIBP in MSR_SPEC_CTRL if supported */
 	if (boot_cpu_has(X86_FEATURE_STIBP))
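
For reference, the effect of the new clamping logic can be checked with a
standalone sketch (mirroring the patch's mask arithmetic, not kernel code):
only the three known bits are cleared, any unknown/reserved bits remain
intact, and nothing is written back unless a known bit was actually found
set, matching the comment above about not touching SPEC_CTRL unnecessarily
so the virt MSR intercept can stay enabled as long as possible.

#include <stdint.h>
#include <stdio.h>

#define SPEC_CTRL_IBRS	(1ULL << 0)
#define SPEC_CTRL_STIBP	(1ULL << 1)
#define SPEC_CTRL_SSBD	(1ULL << 2)
#define SPEC_CTRL_ALL	(SPEC_CTRL_IBRS | SPEC_CTRL_STIBP | SPEC_CTRL_SSBD)

int main(void)
{
	/* pretend the previous kernel (spec_store_bypass_disable=on) left
	 * SSBD set across kexec and the BIOS set a hypothetical reserved
	 * bit 63 that must be preserved */
	uint64_t base = SPEC_CTRL_SSBD | (1ULL << 63);

	if (base & SPEC_CTRL_ALL) {
		base &= ~SPEC_CTRL_ALL;
		/* here the kernel would wrmsrl(MSR_IA32_SPEC_CTRL, base) */
	}

	printf("%#llx\n", (unsigned long long)base);	/* 0x8000000000000000 */
	return 0;
}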