[v6,04/20] x86/split_lock: Align x86_capability to unsigned long to avoid split locked access

Message ID 1554326526-172295-5-git-send-email-fenghua.yu@intel.com (mailing list archive)
State Not Applicable
Delegated to: Johannes Berg
Headers show
Series x86/split_lock: Enable split locked accesses detection

Commit Message

Fenghua Yu April 3, 2019, 9:21 p.m. UTC
set_cpu_cap() calls locked BTS and clear_cpu_cap() calls locked BTR to
operate on the bitmap defined in x86_capability.

Locked BTS/BTR accesses a single unsigned long location. In 64-bit mode,
the location is at:
base address of x86_capability + (bit offset in x86_capability / 64) * 8

Since the base address of x86_capability may not be aligned to unsigned
long, the single unsigned long location may span two cache lines, and
accessing the location with locked BTS/BTR instructions will trigger #AC.

To fix the split lock issue, align x86_capability to unsigned long so that
the location will always be within one cache line.

Changing x86_capability[]'s type to unsigned long would also fix the issue
because x86_capability[] would then be naturally aligned to unsigned long,
but that needs additional code changes. So choose the simpler solution of
setting the alignment to the size of unsigned long.
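
For illustration, a minimal user-space sketch of the address arithmetic
above (not kernel code; the 64-byte cache line size and the example base
address are assumptions):

/*
 * Compute the 8-byte word a 64-bit locked BTS/BTR would access for a
 * given bit offset, and check whether that word straddles a cache line.
 */
#include <stdint.h>
#include <stdio.h>

#define CACHELINE	64

static int bts64_splits_cacheline(uintptr_t base, unsigned int bit)
{
	/* 64-bit BTS/BTR does an RMW on the unsigned long holding the bit */
	uintptr_t word = base + (bit / 64) * 8;

	/* Split lock iff the 8-byte access crosses a cache line boundary */
	return (word / CACHELINE) != ((word + 7) / CACHELINE);
}

int main(void)
{
	/* 4-byte aligned but not 8-byte aligned base, e.g. line offset 60 */
	uintptr_t base = 0x1000 + 60;

	printf("bit  0: split=%d\n", bts64_splits_cacheline(base, 0));
	printf("bit 64: split=%d\n", bts64_splits_cacheline(base, 64));
	return 0;
}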

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
---
 arch/x86/include/asm/processor.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

David Laight April 4, 2019, 2:44 p.m. UTC | #1
From: Fenghua Yu
> Sent: 03 April 2019 22:22
> set_cpu_cap() calls locked BTS and clear_cpu_cap() calls locked BTR to
> operate on bitmap defined in x86_capability.
> 
> Locked BTS/BTR accesses a single unsigned long location. In 64-bit mode,
> the location is at:
> base address of x86_capability + (bit offset in x86_capability / 64) * 8
> 
> Since base address of x86_capability may not be aligned to unsigned long,
> the single unsigned long location may cross two cache lines and
> accessing the location by locked BTS/BTR introductions will trigger #AC.

That is not true.
The BTS/BTR instructions access the memory word that contains the
expected bit.
The 'operand size' only affects the size of the register used for the
bit offset.
If the 'operand size' is 16 bits wide (+/- 32k bit offset) the cpu might
do an aligned 16-bit memory access, otherwise (32- or 64-bit bit offset) it
might do an aligned 32-bit access.
It should never do a 64-bit access, and never a misaligned one (even if
the base address is misaligned).

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
David Laight April 4, 2019, 4:24 p.m. UTC | #2
From: David Laight
> Sent: 04 April 2019 15:45
> 
> From: Fenghua Yu
> > Sent: 03 April 2019 22:22
> > set_cpu_cap() calls locked BTS and clear_cpu_cap() calls locked BTR to
> > operate on bitmap defined in x86_capability.
> >
> > Locked BTS/BTR accesses a single unsigned long location. In 64-bit mode,
> > the location is at:
> > base address of x86_capability + (bit offset in x86_capability / 64) * 8
> >
> > Since base address of x86_capability may not be aligned to unsigned long,
> > the single unsigned long location may cross two cache lines and
> > accessing the location by locked BTS/BTR introductions will trigger #AC.
> 
> That is not true.
> The BTS/BTR instructions access the memory word that contains the
> expected bit.
> The 'operand size' only affects the size of the register use for the
> bit offset.
> If the 'operand size' is 16 bits wide (+/- 32k bit offset) the cpu might
> do an aligned 16bit memory access, otherwise (32 or 64bit bit offset) it
> might do an aligned 32 bit access.
> It should never do an 64bit access and never a misaligned one (even if
> the base address is misaligned).

Hmmm... I may have misread things slightly.
The accessed address is 'Effective Address + (4 ∗ (BitOffset DIV 32))'.
However, nothing suggests that it ever does 64-bit accesses.

If it does do 64-bit accesses when the operand size is 64 bits, then the
asm stubs ought to be changed to only specify a 32-bit operand size.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
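
For reference, here is a rough sketch of the two operand sizes as
hypothetical inline-asm helpers (these are not the kernel's actual bitops
stubs): the l vs. q suffix, i.e. the REX.W prefix, is the only difference,
and as the follow-ups note it is what decides how wide the memory RMW is.

/* Hypothetical helpers, not the kernel's bitops. */
#include <stdint.h>

static inline void lock_bts32(uint32_t bit, volatile void *addr)
{
	/* 32-bit operand size: 4-byte RMW on the dword containing the bit */
	asm volatile("lock btsl %1, %0"
		     : "+m" (*(volatile uint32_t *)addr)
		     : "r" (bit)
		     : "memory");
}

static inline void lock_bts64(uint64_t bit, volatile void *addr)
{
	/* 64-bit operand size (REX.W): 8-byte RMW on the qword for the bit */
	asm volatile("lock btsq %1, %0"
		     : "+m" (*(volatile uint64_t *)addr)
		     : "r" (bit)
		     : "memory");
}
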
Sean Christopherson April 4, 2019, 4:35 p.m. UTC | #3
On Thu, Apr 04, 2019 at 04:24:15PM +0000, David Laight wrote:
> From: David Laight
> > Sent: 04 April 2019 15:45
> > 
> > From: Fenghua Yu
> > > Sent: 03 April 2019 22:22
> > > set_cpu_cap() calls locked BTS and clear_cpu_cap() calls locked BTR to
> > > operate on bitmap defined in x86_capability.
> > >
> > > Locked BTS/BTR accesses a single unsigned long location. In 64-bit mode,
> > > the location is at:
> > > base address of x86_capability + (bit offset in x86_capability / 64) * 8
> > >
> > > Since base address of x86_capability may not be aligned to unsigned long,
> > > the single unsigned long location may cross two cache lines and
> > > accessing the location by locked BTS/BTR introductions will trigger #AC.
> > 
> > That is not true.
> > The BTS/BTR instructions access the memory word that contains the
> > expected bit.
> > The 'operand size' only affects the size of the register use for the
> > bit offset.
> > If the 'operand size' is 16 bits wide (+/- 32k bit offset) the cpu might
> > do an aligned 16bit memory access, otherwise (32 or 64bit bit offset) it
> > might do an aligned 32 bit access.
> > It should never do an 64bit access and never a misaligned one (even if
> > the base address is misaligned).
> 
> Hmmm... I may have misread things slightly.
> The accessed address is 'Effective Address + (4 ∗ (BitOffset DIV 32))'.
> However nothing suggests that it ever does 64bit accesses.
> 
> If it does do 64bit accesses when the operand size is 64 bits then the
> asm stubs ought to be changed to only specify 32bit operand size.

Heh, we had this discussion before[1]: the op size dictates the size of
the memory access and can generate unaligned accesses.

[1] https://lkml.kernel.org/r/20181127195153.GE27075@linux.intel.com
Thomas Gleixner April 4, 2019, 4:52 p.m. UTC | #4
On Thu, 4 Apr 2019, David Laight wrote:
> From: David Laight Sent: 04 April 2019 15:45
> > From: Fenghua Yu Sent: 03 April 2019 22:22
> > That is not true.
> > The BTS/BTR instructions access the memory word that contains the
> > expected bit.
> > The 'operand size' only affects the size of the register use for the
> > bit offset.
> > If the 'operand size' is 16 bits wide (+/- 32k bit offset) the cpu might
> > do an aligned 16bit memory access, otherwise (32 or 64bit bit offset) it
> > might do an aligned 32 bit access.
> > It should never do an 64bit access and never a misaligned one (even if
> > the base address is misaligned).
> 
> Hmmm... I may have misread things slightly.
> The accessed address is 'Effective Address + (4 ∗ (BitOffset DIV 32))'.
> However nothing suggests that it ever does 64bit accesses.
>
> If it does do 64bit accesses when the operand size is 64 bits then the
> asm stubs ought to be changed to only specify 32bit operand size.

bitops operate on unsigned long arrays, so the RMW on the affected array
member has to be atomic vs. other RMW operations on the same array
member. If we make the bitops 32-bit wide on x86/64, we break that.

So selecting 64-bit access (REX.W prefix) is correct and has to stay.

Thanks,

	tglx
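
To illustrate the point with a hypothetical flag word (plain C11 atomics,
not kernel code): generic code freely mixes single-bit RMWs with full-word
RMWs on the same unsigned long, which is the scenario described above; the
single-bit RMW has to stay atomic against the word-wide RMW on that member.

#include <stdatomic.h>

static _Atomic unsigned long flags;	/* stands in for one array member */

void set_high_flag(void)
{
	/* kernel analogue: set_bit(40, &flags) -> lock bts with REX.W */
	atomic_fetch_or(&flags, 1UL << 40);
}

void claim_low_flag(void)
{
	/* kernel analogue: cmpxchg() on the same unsigned long */
	unsigned long old = atomic_load(&flags);

	while (!atomic_compare_exchange_weak(&flags, &old, old | 1UL))
		;	/* old is refreshed on failure, retry */
}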
Paolo Bonzini April 4, 2019, 5:29 p.m. UTC | #5
On 04/04/19 18:52, Thomas Gleixner wrote:
> On Thu, 4 Apr 2019, David Laight wrote:
>> From: David Laight Sent: 04 April 2019 15:45
>>> From: Fenghua Yu Sent: 03 April 2019 22:22
>>> That is not true.
>>> The BTS/BTR instructions access the memory word that contains the
>>> expected bit.
>>> The 'operand size' only affects the size of the register use for the
>>> bit offset.
>>> If the 'operand size' is 16 bits wide (+/- 32k bit offset) the cpu might
>>> do an aligned 16bit memory access, otherwise (32 or 64bit bit offset) it
>>> might do an aligned 32 bit access.
>>> It should never do an 64bit access and never a misaligned one (even if
>>> the base address is misaligned).
>>
>> Hmmm... I may have misread things slightly.
>> The accessed address is 'Effective Address + (4 ∗ (BitOffset DIV 32))'.
>> However nothing suggests that it ever does 64bit accesses.
>>
>> If it does do 64bit accesses when the operand size is 64 bits then the
>> asm stubs ought to be changed to only specify 32bit operand size.
> 
> bitops operate on unsigned long arrays, so the RMW on the affected array
> member has to be atomic vs. other RMW operations on the same array
> member. If we make the bitops 32bit wide on x86/64 we break that.
> 
> So selecting 64bit access (REX.W prefix) is correct and has to stay.

Aren't bitops always atomic with respect to the whole cache line(s)?  We
regularly rely on cmpxchgl being atomic with respect to movb.

Paolo
Thomas Gleixner April 4, 2019, 6:11 p.m. UTC | #6
On Thu, 4 Apr 2019, Paolo Bonzini wrote:

> On 04/04/19 18:52, Thomas Gleixner wrote:
> > On Thu, 4 Apr 2019, David Laight wrote:
> >> From: David Laight Sent: 04 April 2019 15:45
> >>> From: Fenghua Yu Sent: 03 April 2019 22:22
> >>> That is not true.
> >>> The BTS/BTR instructions access the memory word that contains the
> >>> expected bit.
> >>> The 'operand size' only affects the size of the register use for the
> >>> bit offset.
> >>> If the 'operand size' is 16 bits wide (+/- 32k bit offset) the cpu might
> >>> do an aligned 16bit memory access, otherwise (32 or 64bit bit offset) it
> >>> might do an aligned 32 bit access.
> >>> It should never do an 64bit access and never a misaligned one (even if
> >>> the base address is misaligned).
> >>
> >> Hmmm... I may have misread things slightly.
> >> The accessed address is 'Effective Address + (4 ∗ (BitOffset DIV 32))'.
> >> However nothing suggests that it ever does 64bit accesses.
> >>
> >> If it does do 64bit accesses when the operand size is 64 bits then the
> >> asm stubs ought to be changed to only specify 32bit operand size.
> > 
> > bitops operate on unsigned long arrays, so the RMW on the affected array
> > member has to be atomic vs. other RMW operations on the same array
> > member. If we make the bitops 32bit wide on x86/64 we break that.
> > 
> > So selecting 64bit access (REX.W prefix) is correct and has to stay.
> 
> Aren't bitops always atomic with respect to the whole cache line(s)?  We
> regularly rely on cmpxchgl being atomic with respect to movb.

Yes, but if your long goes across a cacheline then you have lost, due to the
requirement to lock both cachelines. Same problem as with bitops, and I'd
rather catch all of those than just some.

Thanks,

	tglx
David Laight April 5, 2019, 9:23 a.m. UTC | #7
From: Thomas Gleixner [mailto:tglx@linutronix.de]
> 
> On Thu, 4 Apr 2019, David Laight wrote:
...
> bitops operate on unsigned long arrays, so the RMW on the affected array
> member has to be atomic vs. other RMW operations on the same array
> member. If we make the bitops 32bit wide on x86/64 we break that.
> 
> So selecting 64bit access (REX.W prefix) is correct and has to stay.

Even if the lock had fine granularity, it shouldn't matter whether
two RMW operations on different parts of the same 'long' happened
at the same time.
ISTR some of the 'constant bit offset' cases are done with byte-sized
locked transfers.

The problem here is that the bitmap is defined as __u32[], not
unsigned long[].

-	__u32			x86_capability[NCAPINTS + NBUGINTS];
+	/* Unsigned long alignment to avoid split lock in atomic bitmap ops */
+	__u32			x86_capability[NCAPINTS + NBUGINTS]
+				__aligned(sizeof(unsigned long));

Somewhere there is a cast.....

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
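
For reference, the wrapper and cast being alluded to look roughly like this
(paraphrased, not verbatim kernel source):

/* The __u32 array is handed to the unsigned-long bitops through a cast,
 * which is how the 64-bit locked BTS/BTR ends up assuming unsigned long
 * alignment for a 4-byte-aligned array. */
#define set_cpu_cap(c, bit) \
	set_bit(bit, (unsigned long *)((c)->x86_capability))

With the patch below, the array keeps its __u32 type but gains unsigned long
alignment, so the 8-byte words reached through that cast can no longer
straddle a cache line.
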

Patch

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 2bb3a648fc12..7c62b9ad6e5a 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -93,7 +93,9 @@  struct cpuinfo_x86 {
 	__u32			extended_cpuid_level;
 	/* Maximum supported CPUID level, -1=no CPUID: */
 	int			cpuid_level;
-	__u32			x86_capability[NCAPINTS + NBUGINTS];
+	/* Aligned to size of unsigned long to avoid split lock in atomic ops */
+	__u32			x86_capability[NCAPINTS + NBUGINTS]
+				__aligned(sizeof(unsigned long));
 	char			x86_vendor_id[16];
 	char			x86_model_id[64];
 	/* in KB - valid for CPUS which support this call: */
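
A quick user-space way to see what the added attribute does to the member's
offset (mock structs, not struct cpuinfo_x86; the 4-byte pad is only there
to force an offset that is 4-byte but not 8-byte aligned):

#include <stddef.h>
#include <stdio.h>

/* 'before' mirrors a plain __u32 array member, 'after' mirrors the
 * patch's __aligned(sizeof(unsigned long)) annotation. */
struct mock_before {
	char		pad[4];
	unsigned int	caps[4];
};

struct mock_after {
	char		pad[4];
	unsigned int	caps[4] __attribute__((aligned(sizeof(unsigned long))));
};

int main(void)
{
	/* before: offset 4 (not 8-byte aligned); after: offset 8 */
	printf("before: %zu\n", offsetof(struct mock_before, caps));
	printf("after:  %zu\n", offsetof(struct mock_after, caps));
	return 0;
}

Because the member alignment also raises the containing struct's alignment,
the absolute address of the array ends up unsigned-long aligned as well.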