
[v8,13/15] x86/split_lock: Enable split lock detection by default

Message ID 1556134382-58814-14-git-send-email-fenghua.yu@intel.com (mailing list archive)
State Not Applicable
Delegated to: Johannes Berg
Series x86/split_lock: Enable split lock detection

Commit Message

Fenghua Yu April 24, 2019, 7:33 p.m. UTC
A split locked access locks the bus and degrades overall memory access
performance. When the split lock detection feature is enumerated, enable
the feature by default by writing 1 to bit 29 in MSR TEST_CTL to find
any split lock issues.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
---
 arch/x86/kernel/cpu/intel.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)
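
For reference, the MSR and the bit referred to above are defined earlier in
the series roughly as follows (a sketch from memory of the series and the
SDM; the 0x33 address and the BIT() form are my assumptions, only the macro
names appear in this patch):

#define MSR_TEST_CTL				0x00000033
#define TEST_CTL_SPLIT_LOCK_DETECT_SHIFT	29
#define TEST_CTL_SPLIT_LOCK_DETECT		BIT(TEST_CTL_SPLIT_LOCK_DETECT_SHIFT)

/* Illustrative only: enabling detection is a read-modify-write of bit 29. */
static void enable_split_lock_detect(void)
{
	u64 val;

	rdmsrl(MSR_TEST_CTL, val);
	wrmsrl(MSR_TEST_CTL, val | TEST_CTL_SPLIT_LOCK_DETECT);
}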

Comments

Thomas Gleixner April 25, 2019, 7:50 a.m. UTC | #1
On Wed, 24 Apr 2019, Fenghua Yu wrote:
>  
> +static void split_lock_update_msr(void)
> +{
> +	/* Enable split lock detection */
> +	msr_set_bit(MSR_TEST_CTL, TEST_CTL_SPLIT_LOCK_DETECT_SHIFT);
> +	this_cpu_or(msr_test_ctl_cache, TEST_CTL_SPLIT_LOCK_DETECT);

I'm pretty sure that I told you to utilize the cache properly. Again:

> > Nothing in this file initializes msr_test_ctl_cache explicitly. Register
> > caching always requires to read the register and store it in the cache
> > before doing anything with it. Nothing guarantees that all bits in that MSR
> > are 0 by default forever.
> >
> > And once you do that _before_ calling split_lock_update_msr() then you can
> > spare the RMW in that function.

So you managed to fix the initialization part, but then you still do a
pointless RMW.

Thanks,

	tglx
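
A sketch of the caching pattern being asked for here, using the identifiers
from this series: seed the per-CPU cache from the hardware once, then let
split_lock_update_msr() write cache | bit without re-reading the MSR. This
is illustrative only (the DEFINE_PER_CPU declaration is assumed), not the
code that was eventually merged:

static DEFINE_PER_CPU(u64, msr_test_ctl_cache);

static void split_lock_update_msr(void)
{
	/* Enable split lock detection from the cached value; no MSR read here. */
	this_cpu_or(msr_test_ctl_cache, TEST_CTL_SPLIT_LOCK_DETECT);
	wrmsrl(MSR_TEST_CTL, this_cpu_read(msr_test_ctl_cache));
}

static void init_split_lock_detect(struct cpuinfo_x86 *c)
{
	u64 test_ctl_val;

	if (!cpu_has(c, X86_FEATURE_SPLIT_LOCK_DETECT))
		return;

	/* Seed the cache from the hardware before touching anything. */
	rdmsrl(MSR_TEST_CTL, test_ctl_val);
	this_cpu_write(msr_test_ctl_cache, test_ctl_val);

	split_lock_update_msr();
}
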
David Laight April 25, 2019, 10:52 a.m. UTC | #2
From:  Fenghua Yu
> Sent: 24 April 2019 20:33
> A split locked access locks the bus and degrades overall memory access
> performance. When the split lock detection feature is enumerated, enable
> the feature by default by writing 1 to bit 29 in MSR TEST_CTL to find
> any split lock issues.

You can't enable this by default until ALL the known potentially
misaligned locked memory operations have been fixed.

AFAICT you've only fixed the ones that have actually faulted.
You've not even fixed the other ones in the same source files
as ones you've 'fixed'.

You need to do a code audit of all the in-tree kernel code
that uses locked operations - especially the 'bit' ones
since a lot of code casts the bitmap address - so it probably
isn't long aligned.

	David

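To make the concern about casts concrete, the pattern in question looks
roughly like this (a made-up example, not a real in-tree site):

#include <linux/bitops.h>
#include <linux/types.h>

struct foo {			/* hypothetical structure */
	u32 state;
	u32 flags;		/* offset 4: not long-aligned on 64-bit */
	u32 count;
};

static void mark_flag(struct foo *f, int bit)
{
	/*
	 * The cast keeps the compiler quiet, but on a 64-bit kernel
	 * set_bit() issues a locked bit instruction on a long-sized
	 * word at an address that is only 4-byte aligned. If that
	 * word happens to straddle a cache line, the access is a split
	 * lock and will trip #AC once detection is enabled.
	 */
	set_bit(bit, (unsigned long *)&f->flags);
}
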
Thomas Gleixner April 25, 2019, 10:58 a.m. UTC | #3
On Thu, 25 Apr 2019, David Laight wrote:

> From:  Fenghua Yu
> > Sent: 24 April 2019 20:33
> > A split locked access locks the bus and degrades overall memory access
> > performance. When the split lock detection feature is enumerated, enable
> > the feature by default by writing 1 to bit 29 in MSR TEST_CTL to find
> > any split lock issues.
> 
> You can't enable this by default until ALL the known potentially
> misaligned locked memory operations have been fixed.

Errm? The result will be a WARN_ON() printed and no further damage. It's
not making anything worse than it is now. In fact we just should add a

    WARN_ON_ONCE(!aligned_to_long(p)) to all the xxx_bit() operations.

so we catch them even when they do not trigger that #AC thingy.

Thanks,

	tglx
David Laight April 25, 2019, 11:13 a.m. UTC | #4
From: Thomas Gleixner
> Sent: 25 April 2019 11:59
> On Thu, 25 Apr 2019, David Laight wrote:
> 
> > From:  Fenghua Yu
> > > Sent: 24 April 2019 20:33
> > > A split locked access locks the bus and degrades overall memory access
> > > performance. When the split lock detection feature is enumerated, enable
> > > the feature by default by writing 1 to bit 29 in MSR TEST_CTL to find
> > > any split lock issues.
> >
> > You can't enable this by default until ALL the known potentially
> > misaligned locked memory operations have been fixed.
> 
> Errm? The result will be a WARN_ON() printed and no further damage.

ISTR something about sending SIGSEGV to userspace.

> It's not making anything worse than it is now. In fact we just should add a
> 
>     WARN_ON_ONCE(!aligned_to_long(p)) to all the xxx_bit() operations.
> 
> so we catch them even when they do not trigger that #AC thingy.

That will explode the kernel code size.
In any case some of the items I found in a quick scan were bss/data
so the alignment will vary from build to build.

I also found some casts on the xxx_bit() functions in generic code.
I didn't look to see how badly wrong they go on BE systems.

While the x86 xxx_bit() functions could easily be changed to do
32bit accesses, the 'misaligned' operations will affect all
architectures - and may have different effects on others.

I'm not at all sure that 'compare and exchange' operations
are atomic on all cpus if the data is misaligned and crosses
a page boundary and either (or both) pages need faulting in
(or hit a TLB miss).

	David

Peter Zijlstra April 25, 2019, 11:41 a.m. UTC | #5
On Thu, Apr 25, 2019 at 11:13:23AM +0000, David Laight wrote:
> I'm not at all sure that 'compare and exchange' operations
> are atomic on all cpus if the data is misaligned and crosses
> a page boundary and either (or both) pages need faulting in
> (or hit a TLB miss).

Most sane architectures already trap if you attempt that.
Thomas Gleixner April 25, 2019, 1:04 p.m. UTC | #6
On Thu, 25 Apr 2019, David Laight wrote:
> From: Thomas Gleixner
> > Sent: 25 April 2019 11:59
> > On Thu, 25 Apr 2019, David Laight wrote:
> > 
> > > From:  Fenghua Yu
> > > > Sent: 24 April 2019 20:33
> > > > A split locked access locks the bus and degrades overall memory access
> > > > performance. When the split lock detection feature is enumerated, enable
> > > > the feature by default by writing 1 to bit 29 in MSR TEST_CTL to find
> > > > any split lock issues.
> > >
> > > You can't enable this by default until ALL the known potentially
> > > misaligned locked memory operations have been fixed.
> > 
> > Errm? The result will be a WARN_ON() printed and no further damage.
> 
> ISTR something about sending SIGSEGV to userspace.

Nope. If the #AC originates from the kernel then a warning is printed and the
detection is disabled.

Userspace is a different story. We cannot audit all user space upfront,
right?

> > It's not making anything worse than it is now. In fact we just should add a
> > 
> >     WARN_ON_ONCE(!aligned_to_long(p)) to all the xxx_bit() operations.
> > 
> > so we catch them even when they do not trigger that #AC thingy.
> 
> That will explode the kernel code size.

So what? We have enough debug options which make the kernel big. One more
does not really matter.

Thanks,

	tglx
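
A rough sketch of the kernel-side policy described above: warn once and
switch detection off on this CPU rather than killing anything. The function
name is made up and this is not the handler from the series; it only
illustrates the behaviour:

static bool handle_kernel_split_lock(struct pt_regs *regs)
{
	if (user_mode(regs))
		return false;		/* user space is handled separately */

	WARN_ONCE(1, "split lock detected in the kernel at %pS\n",
		  (void *)regs->ip);

	/* Clear bit 29 on this CPU so execution can continue. */
	msr_clear_bit(MSR_TEST_CTL, TEST_CTL_SPLIT_LOCK_DETECT_SHIFT);

	return true;
}

Something along these lines would be called from the #AC trap handler before
any signal handling takes place.
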
Fenghua Yu May 6, 2019, 9:39 p.m. UTC | #7
On Thu, Apr 25, 2019 at 09:50:20AM +0200, Thomas Gleixner wrote:
> On Wed, 24 Apr 2019, Fenghua Yu wrote:
> >  
> > +static void split_lock_update_msr(void)
> > +{
> > +	/* Enable split lock detection */
> > +	msr_set_bit(MSR_TEST_CTL, TEST_CTL_SPLIT_LOCK_DETECT_SHIFT);
> > +	this_cpu_or(msr_test_ctl_cache, TEST_CTL_SPLIT_LOCK_DETECT);
> 
> I'm pretty sure that I told you to utilize the cache properly. Again:
> 
> > > Nothing in this file initializes msr_test_ctl_cache explicitly. Register
> > > caching always requires to read the register and store it in the cache
> > > before doing anything with it. Nothing guarantees that all bits in that MSR
> > > are 0 by default forever.
> > >
> > > And once you do that _before_ calling split_lock_update_msr() then you can
> > > spare the RMW in that function.
> 
> So you managed to fix the initialization part, but then you still do a
> pointless RMW.

OK, I see: msr_set_bit() is an RMW operation.

So is the following the right code to update the MSR and the cache variable?

+static void split_lock_update_msr(void)
+{
+   /* Enable split lock detection */
+   this_cpu_or(msr_test_ctl_cache, TEST_CTL_SPLIT_LOCK_DETECT);
+   wrmsrl(MSR_TEST_CTL, msr_test_ctl_cache);
+}

Thanks.

-Fenghua
Fenghua Yu May 7, 2019, 8:48 p.m. UTC | #8
On Thu, Apr 25, 2019 at 12:58:32PM +0200, Thomas Gleixner wrote:
> On Thu, 25 Apr 2019, David Laight wrote:
> 
> > From:  Fenghua Yu
> > > Sent: 24 April 2019 20:33
> > > A split locked access locks the bus and degrades overall memory access
> > > performance. When the split lock detection feature is enumerated, enable
> > > the feature by default by writing 1 to bit 29 in MSR TEST_CTL to find
> > > any split lock issues.
> > 
> > You can't enable this by default until ALL the known potentially
> > misaligned locked memory operations have been fixed.
> 
> Errm? The result will be a WARN_ON() printed and no further damage. It's
> not making anything worse than it is now. In fact we just should add a
> 
>     WARN_ON_ONCE(!aligned_to_long(p)) to all the xxx_bit() operations.
> 
> so we catch them even when they do not trigger that #AC thingy.

I added WARN_ON_ONCE() to the atomic xxx_bit() operations, but the code
does not compile.

Here is a simplified patch (only adding the warning in set_bit()):

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index 8e790ec219a5..bc889ac12e26 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -14,6 +14,8 @@
 #endif

 #include <linux/compiler.h>
+#include <linux/kernel.h>
+#include <asm-generic/bug.h>
 #include <asm/alternative.h>
 #include <asm/rmwcc.h>
 #include <asm/barrier.h>
@@ -67,6 +69,8 @@
 static __always_inline void
 set_bit(long nr, volatile unsigned long *addr)
 {
+       WARN_ON_ONCE(!IS_ALIGNED((unsigned long)addr, sizeof(unsigned long)));
+
        if (IS_IMMEDIATE(nr)) {
                asm volatile(LOCK_PREFIX "orb %1,%0"
                        : CONST_MASK_ADDR(nr, addr)

gcc reports errors:
  CC      kernel/bounds.s
  CALL    scripts/atomic/check-atomics.sh
In file included from ./include/linux/bitops.h:19,
                 from ./include/linux/kernel.h:12,
                 from ./include/asm-generic/bug.h:18,
                 from ./arch/x86/include/asm/bug.h:83,
                 from ./include/linux/bug.h:5,
                 from ./include/linux/page-flags.h:10,
                 from kernel/bounds.c:10:
./arch/x86/include/asm/bitops.h: In function ‘set_bit’:
./arch/x86/include/asm/bitops.h:72:2: error: implicit declaration of function ‘WARN_ON_ONCE’; did you mean ‘WRITE_ONCE’? [-Werror=implicit-function-declaration]
  WARN_ON_ONCE(!IS_ALIGNED((unsigned long)addr, sizeof(unsigned long)));
  ^~~~~~~~~~~~
./arch/x86/include/asm/bitops.h:72:16: error: implicit declaration of function ‘IS_ALIGNED’; did you mean ‘IS_ENABLED’? [-Werror=implicit-function-declaration]
  WARN_ON_ONCE(!IS_ALIGNED((unsigned long)addr, sizeof(unsigned long)));
                ^~~~~~~~~~
I think it's because arch/x86/include/asm/bitops.h is included in
include/linux/kernel.h before IS_ALIGNED() is defined and in
include/asm-generic/bug.h before WARN_ON_ONCE() is defined.

How should I write a correct warning patch and solve this compilation issue?

Thanks.

-Fenghua
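
One way around the include cycle (a sketch only; nothing like this was
applied in this thread) is to keep asm/bitops.h free of kernel.h/bug.h by
open-coding the alignment test and moving the actual warning out of line.
The helper name and the .c file below are hypothetical:

/* arch/x86/include/asm/bitops.h; needs no new includes */
void __warn_misaligned_bitop(const volatile unsigned long *addr);

static __always_inline void
set_bit(long nr, volatile unsigned long *addr)
{
	/* Open-coded IS_ALIGNED(): bitops.h must not pull in kernel.h. */
	if (unlikely((unsigned long)addr & (sizeof(unsigned long) - 1)))
		__warn_misaligned_bitop(addr);

	/* existing set_bit() body unchanged */
}

/* arch/x86/lib/misc.c (any C file that can include bug.h) */
void __warn_misaligned_bitop(const volatile unsigned long *addr)
{
	WARN_ONCE(1, "misaligned bitmap address %px\n", addr);
}

Keeping the WARN out of line would also limit the code-size growth mentioned
earlier, since each inlined call site only gains a test and a conditional
call.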

Patch

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 2cc69217ca7c..28cc6891ba48 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -657,6 +657,13 @@  static void init_intel_misc_features(struct cpuinfo_x86 *c)
 	wrmsrl(MSR_MISC_FEATURES_ENABLES, msr);
 }
 
+static void split_lock_update_msr(void)
+{
+	/* Enable split lock detection */
+	msr_set_bit(MSR_TEST_CTL, TEST_CTL_SPLIT_LOCK_DETECT_SHIFT);
+	this_cpu_or(msr_test_ctl_cache, TEST_CTL_SPLIT_LOCK_DETECT);
+}
+
 static void init_split_lock_detect(struct cpuinfo_x86 *c)
 {
 	if (cpu_has(c, X86_FEATURE_SPLIT_LOCK_DETECT)) {
@@ -665,6 +672,8 @@  static void init_split_lock_detect(struct cpuinfo_x86 *c)
 		/* Cache MSR TEST_CTL */
 		rdmsrl(MSR_TEST_CTL, test_ctl_val);
 		this_cpu_write(msr_test_ctl_cache, test_ctl_val);
+
+		split_lock_update_msr();
 	}
 }
 
@@ -1045,9 +1054,13 @@  static const struct cpu_dev intel_cpu_dev = {
 
 cpu_dev_register(intel_cpu_dev);
 
+#undef pr_fmt
+#define pr_fmt(fmt) "x86/split lock detection: " fmt
+
 static void __init set_split_lock_detect(void)
 {
 	setup_force_cpu_cap(X86_FEATURE_SPLIT_LOCK_DETECT);
+	pr_info("enabled\n");
 }
 
 void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c)