
[v8,3/3] x86, mce: Add __mcsafe_copy()

Message ID CA+8MBbKdH8v=gkTqzxpPRX9-jBEobU9XaEfZh=4cOXDjPE9fBA@mail.gmail.com (mailing list archive)
State New, archived

Commit Message

Tony Luck Jan. 30, 2016, 12:35 a.m. UTC
On Wed, Jan 13, 2016 at 8:39 PM, Borislav Petkov <bp@alien8.de> wrote:
> On Wed, Jan 13, 2016 at 03:22:58PM -0800, Tony Luck wrote:
>> Are there some examples of synthetic CPUID bits?
>
> X86_FEATURE_ALWAYS is one. The others got renamed into X86_BUG_* ones,
> the remaining mechanism is the same, though.

So something like this [gmail will line wrap, but should still be legible]

Then Dan will be able to use:

      if (cpu_has(c, X86_FEATURE_MCRECOVERY))

to decide whether to use the (slightly slower, but recovery capable)
__mcsafe_copy()
or just pick the fastest memcpy() instead.
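
For illustration, an untested sketch of what such a call site might look
like (boot_cpu_has() used here since most call sites won't have a struct
cpuinfo_x86 pointer handy; the -EIO convention is just an assumption):

	/* sketch only: dispatch on the synthetic feature bit */
	if (boot_cpu_has(X86_FEATURE_MCRECOVERY)) {
		struct mcsafe_ret ret = __mcsafe_copy(dst, src, len);

		if (ret.remain)		/* short copy => we took a #MC */
			return -EIO;
	} else {
		memcpy(dst, src, len);
	}
	return 0;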

-Tony


                pr_emerg("Couldn't allocate MCE records pool!\n");

Comments

Borislav Petkov Jan. 30, 2016, 10:28 a.m. UTC | #1
On Fri, Jan 29, 2016 at 04:35:35PM -0800, Tony Luck wrote:
> On Wed, Jan 13, 2016 at 8:39 PM, Borislav Petkov <bp@alien8.de> wrote:
> > On Wed, Jan 13, 2016 at 03:22:58PM -0800, Tony Luck wrote:
> >> Are there some examples of synthetic CPUID bits?
> >
> > X86_FEATURE_ALWAYS is one. The others got renamed into X86_BUG_* ones,
> > the remaining mechanism is the same, though.
> 
> So something like this [gmail will line wrap, but should still be legible]
> 
> Then Dan will be able to use:
> 
>       if (cpu_has(c, X86_FEATURE_MCRECOVERY))
> 
> to decide whether to use the (slightly slower, but recovery capable)
> __mcsafe_copy()
> or just pick the fastest memcpy() instead.

The optimal way of alternatively calling two functions would be
something like this, IMO:

alternative_call(memcpy, __mcsafe_copy, X86_FEATURE_MCRECOVERY,
		 ASM_OUTPUT2("=a" (mcsafe_ret.trapnr), "=d" (mcsafe_ret.remain)),
		 "D" (dst), "S" (src), "d" (len));

I hope I've not messed up the calling convention but you want the inputs
in %rdi, %rsi, %rdx and the outputs in %rax, %rdx, respectively. Just
check the asm gcc generates and do not trust me :)

The other thing you probably would need to do is create our own
__memcpy() which returns struct mcsafe_ret so that the signatures of
both functions match.

Yeah, it is a bit of jumping through hoops but this way we do a CALL
<func_ptr> directly in asm, without any JMPs or NOPs padding the other
alternatives methods add.
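
To make that concrete, an untested sketch of such a wrapper (the name is
hypothetical; struct mcsafe_ret is assumed to be two 64-bit fields, so the
ABI returns it in %rax/%rdx, matching the outputs above):

	/* sketch only: plain-memcpy fallback whose signature matches __mcsafe_copy() */
	static struct mcsafe_ret memcpy_mcsafe_fallback(void *dst, const void *src, size_t len)
	{
		struct mcsafe_ret ret = { .trapnr = 0, .remain = 0 };

		memcpy(dst, src, len);	/* plain copy; cannot report a fault */
		return ret;
	}

That would then go in as the first argument of the alternative_call()
above, in place of memcpy.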

But if you don't care about a small JMP and that is not a hot path, you
could do the simpler:

	if (static_cpu_has(X86_FEATURE_MCRECOVERY))
		return __mcsafe_copy(...);

	return memcpy();

which adds a JMP or a 5-byte NOP depending on the X86_FEATURE_MCRECOVERY
setting.

> diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
> index 7ad8c9464297..621e05103633 100644
> --- a/arch/x86/include/asm/cpufeature.h
> +++ b/arch/x86/include/asm/cpufeature.h
> @@ -106,6 +106,7 @@
>  #define X86_FEATURE_APERFMPERF ( 3*32+28) /* APERFMPERF */
>  #define X86_FEATURE_EAGER_FPU  ( 3*32+29) /* "eagerfpu" Non lazy FPU restore */
>  #define X86_FEATURE_NONSTOP_TSC_S3 ( 3*32+30) /* TSC doesn't stop in S3 state */
> +#define X86_FEATURE_MCRECOVERY ( 3*32+31) /* cpu has recoverable machine checks */

Why not write it out?

	X86_FEATURE_MCE_RECOVERY

> 
>  /* Intel-defined CPU features, CPUID level 0x00000001 (ecx), word 4 */
>  #define X86_FEATURE_XMM3       ( 4*32+ 0) /* "pni" SSE-3 */
Tony Luck Feb. 1, 2016, 11:10 p.m. UTC | #2
> The optimal way of alternatively calling two functions would be
> something like this, IMO:
>
> alternative_call(memcpy, __mcsafe_copy, X86_FEATURE_MCRECOVERY,
>                  ASM_OUTPUT2("=a" (mcsafe_ret.trapnr), "=d" (mcsafe_ret.remain)),
>                  "D" (dst), "S" (src), "d" (len));
>
> I hope I've not messed up the calling convention but you want the inputs
> in %rdi, %rsi, %rdx and the outputs in %rax, %rdx, respectively. Just
> check the asm gcc generates and do not trust me :)
>
> The other thing you probably would need to do is create our own
> __memcpy() which returns struct mcsafe_ret so that the signatures of
> both functions match.
>
> Yeah, it is a bit of jumping through hoops but this way we do a CALL
> <func_ptr> directly in asm, without any JMPs or NOPs padding the other
> alternatives methods add.
>
> But if you don't care about a small JMP and that is not a hot path, you
> could do the simpler:
>
>         if (static_cpu_has(X86_FEATURE_MCRECOVERY))
>                 return __mcsafe_copy(...);
>
>         return memcpy();
>
> which adds a JMP or a 5-byte NOP depending on the X86_FEATURE_MCRECOVERY
> setting.

Dan,

What do you want the API to look like at the point you make a call
in the libnvdimm code?  Something like:

        r = nvcopy(dst, src, len);

where the innards of nvcopy() do the check for X86_FEATURE_MCE_RECOVERY?

What is useful to you in the return value? The low-level __mcsafe_copy()
returns both a remainder and a trap number. But in your case I don't think
you need the trap number (if the remaining count is not zero, then there
must have been a #MC; #PF isn't an option for you, right?).

-Tony
Dan Williams Feb. 1, 2016, 11:16 p.m. UTC | #3
On Mon, Feb 1, 2016 at 3:10 PM, Tony Luck <tony.luck@gmail.com> wrote:
>> The optimal way of alternatively calling two functions would be
>> something like this, IMO:
>>
>> alternative_call(memcpy, __mcsafe_copy, X86_FEATURE_MCRECOVERY,
>>                  ASM_OUTPUT2("=a" (mcsafe_ret.trapnr), "=d" (mcsafe_ret.remain)),
>>                  "D" (dst), "S" (src), "d" (len));
>>
>> I hope I've not messed up the calling convention but you want the inputs
>> in %rdi, %rsi, %rdx and the outputs in %rax, %rdx, respectively. Just
>> check the asm gcc generates and do not trust me :)
>>
>> The other thing you probably would need to do is create our own
>> __memcpy() which returns struct mcsafe_ret so that the signatures of
>> both functions match.
>>
>> Yeah, it is a bit of jumping through hoops but this way we do a CALL
>> <func_ptr> directly in asm, without any JMPs or NOPs padding the other
>> alternatives methods add.
>>
>> But if you don't care about a small JMP and that is not a hot path, you
>> could do the simpler:
>>
>>         if (static_cpu_has(X86_FEATURE_MCRECOVERY))
>>                 return __mcsafe_copy(...);
>>
>>         return memcpy();
>>
>> which adds a JMP or a 5-byte NOP depending on the X86_FEATURE_MCRECOVERY
>> setting.
>
> Dan,
>
> What do you want the API to look like at the point you make a call
> in the libnvdimm code?  Something like:
>
>         r = nvcopy(dst, src, len);
>
> where the innards of nvcopy() do the check for X86_FEATURE_MCE_RECOVERY?
>
> What is useful to you in the return value? The low-level __mcsafe_copy()
> returns both a remainder and a trap number. But in your case I don't think
> you need the trap number (if the remaining count is not zero, then there
> must have been a #MC; #PF isn't an option for you, right?).

Right, we don't need a trap number, just an error.  This is the v1
attempt at integrating mcsafe_copy:

https://lists.01.org/pipermail/linux-nvdimm/2016-January/003869.html

I think the only change needed is to use
static_cpu_has(X86_FEATURE_MCRECOVERY) like so:

+static inline int arch_memcpy_from_pmem(void *dst, const void __pmem *src,
+		size_t n)
+{
+	if (static_cpu_has(X86_FEATURE_MCRECOVERY)) {
+		struct mcsafe_ret ret;
+
+		ret = __mcsafe_copy(dst, (void __force *) src, n);
+		if (ret.remain)
+			return -EIO;
+		return 0;
+	}
+	memcpy(dst, (void __force *) src, n);
+	return 0;
+}

Patch

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 7ad8c9464297..621e05103633 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -106,6 +106,7 @@ 
 #define X86_FEATURE_APERFMPERF ( 3*32+28) /* APERFMPERF */
 #define X86_FEATURE_EAGER_FPU  ( 3*32+29) /* "eagerfpu" Non lazy FPU restore */
 #define X86_FEATURE_NONSTOP_TSC_S3 ( 3*32+30) /* TSC doesn't stop in S3 state */
+#define X86_FEATURE_MCRECOVERY ( 3*32+31) /* cpu has recoverable machine checks */

 /* Intel-defined CPU features, CPUID level 0x00000001 (ecx), word 4 */
 #define X86_FEATURE_XMM3       ( 4*32+ 0) /* "pni" SSE-3 */
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index a006f4cd792b..b8980d767240 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1694,6 +1694,14 @@  void mcheck_cpu_init(struct cpuinfo_x86 *c)
                return;
        }

+       /*
+        * MCG_CAP.MCG_SER_P is necessary but not sufficient to know
+        * whether this processor will actually generate recoverable
+        * machine checks. Check to see if this is an E7 model Xeon.
+        */
+       if (mca_cfg.ser && !strncmp(c->x86_model_id, "Intel(R) Xeon(R) CPU E7-", 24))
+               set_cpu_cap(c, X86_FEATURE_MCRECOVERY);
+
        if (mce_gen_pool_init()) {
                mca_cfg.disabled = true;