diff mbox series

[v6,10/20] x86/split_lock: Handle #AC exception for split lock

Message ID 1554326526-172295-11-git-send-email-fenghua.yu@intel.com (mailing list archive)
State Not Applicable
Delegated to: Johannes Berg
Headers show
Series x86/split_lock: Enable split locked accesses detection | expand

Commit Message

Fenghua Yu April 3, 2019, 9:21 p.m. UTC
There may be different considerations on how to handle #AC for split lock,
e.g. how to handle system hang caused by split lock issue in firmware,
how to emulate faulting instruction, etc. We use a simple method to
handle user and kernel split lock and may extend the method in the future.

When #AC exception for split lock is triggered from user process, the
process is killed by SIGBUS. To execute the process properly, a user
application developer needs to fix the split lock issue.

When #AC exception for split lock is triggered from a kernel instruction,
disable split lock detection on local CPU and warn the split lock issue.
After the exception, the faulting instruction will be executed and kernel
execution continues. Split lock detection is only disabled on the local
CPU, not globally. It will be re-enabled if the CPU is offline and then
online or through sysfs interface.

A kernel/driver developer should check the warning, which contains helpful
faulting address, context, and callstack info, and fix the split lock
issues. Then further split lock issues may be captured and fixed.

After bit 29 in MSR_TEST_CTL is set to 1 in kernel, firmware inherits
the setting when firmware is executed in S4, S5, run time services, SMI,
etc. If there is a split lock operation in firmware, it will triggers
#AC and may hang the system depending on how firmware handles the #AC.
It's up to a firmware developer to fix split lock issues in firmware.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
---
 arch/x86/kernel/traps.c | 43 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 42 insertions(+), 1 deletion(-)

Comments

Thomas Gleixner April 4, 2019, 5:31 p.m. UTC | #1
On Wed, 3 Apr 2019, Fenghua Yu wrote:
> +dotraplinkage void do_alignment_check(struct pt_regs *regs, long error_code)
> +{
> +	unsigned int trapnr = X86_TRAP_AC;
> +	char str[] = "alignment check";
> +	int signr = SIGBUS;
> +
> +	RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
> +
> +	/*
> +	 * WARN*()s end up here; fix them up before we call the
> +	 * notifier chain.
> +	 */

How exactly is WARN*() ending up here?

> +	if (!user_mode(regs) && fixup_bug(regs, trapnr))

And that fixup_bug() check does what?

int fixup_bug(struct pt_regs *regs, int trapnr)
{
 	if (trapnr != X86_TRAP_UD)
                return 0;

Copy and paste from do_error_trap() ....

> +		return;
> +
> +	if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, signr) ==
> +		       NOTIFY_STOP)
> +		return;
> +
> +	cond_local_irq_enable(regs);
> +	if (!user_mode(regs) &&
> +	    static_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT)) {
> +		/*
> +		 * Only split lock can generate #AC from kernel at this point.
> +		 * Warn and disable split lock detection on this CPU. The
> +		 * faulting instruction will be executed without generating
> +		 * another #AC fault. User needs to check the warning and
> +		 * fix the split lock issue in the faulting instruction.

  "User needs to check the warning and fix the issue ..."

I'm looking forward to all the fixes from Joe Users.

Please remove that sentence. It's useless. Users report warnings if at all
and the kernel developers who actually look at them surely don't need an
advice like that.

Thanks,

	tglx
Fenghua Yu April 4, 2019, 10:49 p.m. UTC | #2
On Thu, Apr 04, 2019 at 07:31:59PM +0200, Thomas Gleixner wrote:
> On Wed, 3 Apr 2019, Fenghua Yu wrote:
> > +dotraplinkage void do_alignment_check(struct pt_regs *regs, long error_code)
> > +{
> > +	unsigned int trapnr = X86_TRAP_AC;
> > +	char str[] = "alignment check";
> > +	int signr = SIGBUS;
> > +
> > +	RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
> > +
> > +	/*
> > +	 * WARN*()s end up here; fix them up before we call the
> > +	 * notifier chain.
> > +	 */
> 
> How exactly is WARN*() ending up here?
> 
> > +	if (!user_mode(regs) && fixup_bug(regs, trapnr))
> 
> And that fixup_bug() check does what?
> 
> int fixup_bug(struct pt_regs *regs, int trapnr)
> {
>  	if (trapnr != X86_TRAP_UD)
>                 return 0;
> 
> Copy and paste from do_error_trap() ....

As you can see, do_alignment_check() is copied from do_error_trap().
But seems this part of code is irrelevant to #AC handler.

So I will remove the "if (!user_mode(regs) && fixup_bug(regs, trapnr))"
and surrounding code, right?

> 
> > +		return;
> > +
> > +	if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, signr) ==
> > +		       NOTIFY_STOP)
> > +		return;
> > +
> > +	cond_local_irq_enable(regs);
> > +	if (!user_mode(regs) &&
> > +	    static_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT)) {
> > +		/*
> > +		 * Only split lock can generate #AC from kernel at this point.
> > +		 * Warn and disable split lock detection on this CPU. The
> > +		 * faulting instruction will be executed without generating
> > +		 * another #AC fault. User needs to check the warning and
> > +		 * fix the split lock issue in the faulting instruction.
> 
>   "User needs to check the warning and fix the issue ..."
> 
> I'm looking forward to all the fixes from Joe Users.
> 
> Please remove that sentence. It's useless. Users report warnings if at all
> and the kernel developers who actually look at them surely don't need an
> advice like that.

Sure. Will do this.

Thanks.

-Fenghua
diff mbox series

Patch

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index d26f9e9c3d83..0ac992bfa287 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -61,6 +61,7 @@ 
 #include <asm/mpx.h>
 #include <asm/vm86.h>
 #include <asm/umip.h>
+#include <asm/cpu.h>
 
 #ifdef CONFIG_X86_64
 #include <asm/x86_init.h>
@@ -293,9 +294,49 @@  DO_ERROR(X86_TRAP_OLD_MF, SIGFPE,           0, NULL, "coprocessor segment overru
 DO_ERROR(X86_TRAP_TS,     SIGSEGV,          0, NULL, "invalid TSS",         invalid_TSS)
 DO_ERROR(X86_TRAP_NP,     SIGBUS,           0, NULL, "segment not present", segment_not_present)
 DO_ERROR(X86_TRAP_SS,     SIGBUS,           0, NULL, "stack segment",       stack_segment)
-DO_ERROR(X86_TRAP_AC,     SIGBUS,  BUS_ADRALN, NULL, "alignment check",     alignment_check)
 #undef IP
 
+dotraplinkage void do_alignment_check(struct pt_regs *regs, long error_code)
+{
+	unsigned int trapnr = X86_TRAP_AC;
+	char str[] = "alignment check";
+	int signr = SIGBUS;
+
+	RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
+
+	/*
+	 * WARN*()s end up here; fix them up before we call the
+	 * notifier chain.
+	 */
+	if (!user_mode(regs) && fixup_bug(regs, trapnr))
+		return;
+
+	if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, signr) ==
+		       NOTIFY_STOP)
+		return;
+
+	cond_local_irq_enable(regs);
+	if (!user_mode(regs) &&
+	    static_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT)) {
+		/*
+		 * Only split lock can generate #AC from kernel at this point.
+		 * Warn and disable split lock detection on this CPU. The
+		 * faulting instruction will be executed without generating
+		 * another #AC fault. User needs to check the warning and
+		 * fix the split lock issue in the faulting instruction.
+		 */
+		msr_clear_bit(MSR_TEST_CTL,
+			      TEST_CTL_ENABLE_SPLIT_LOCK_DETECT_SHIFT);
+		WARN_ONCE(1, "split lock operation detected\n");
+
+		return;
+	}
+
+	/* Handle #AC generated in any other cases. */
+	do_trap(X86_TRAP_AC, SIGBUS, "alignment check", regs,
+		error_code, BUS_ADRALN, NULL);
+}
+
 #ifdef CONFIG_VMAP_STACK
 __visible void __noreturn handle_stack_overflow(const char *message,
 						struct pt_regs *regs,