diff mbox series

arm64: abort SDEI handlers during crash

Message ID 20230204000851.3871-1-scott@os.amperecomputing.com (mailing list archive)
State New, archived
Headers show
Series arm64: abort SDEI handlers during crash | expand

Commit Message

D Scott Phillips Feb. 4, 2023, 12:08 a.m. UTC
Interrupts are blocked in SDEI context, per the SDEI spec: "The client
interrupts cannot preempt the event handler." If we crashed in the SDEI
handler-running context (as with ACPI's AGDI) then we need to clean up the
SDEI state before proceeding to the crash kernel so that the crash kernel
can have working interrupts.  Try two COMPLETE_AND_RESUMEs in case both a
normal and critical event were being handled.

Fixes: f5df26961853 ("arm64: kernel: Add arch-specific SDEI entry code and CPU masking")
Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com>
---
 arch/arm64/include/asm/sdei.h |  3 +++
 arch/arm64/kernel/entry.S     | 24 +++++++++++++++++++++---
 arch/arm64/kernel/smp.c       | 14 ++++++++++----
 3 files changed, 34 insertions(+), 7 deletions(-)

Comments

James Morse Feb. 10, 2023, 6:19 p.m. UTC | #1
Hi Scott,

On 04/02/2023 00:08, D Scott Phillips wrote:
> Interrupts are blocked in SDEI context, per the SDEI spec: "The client
> interrupts cannot preempt the event handler."

Firmware is supposed to do this by ensuring PSTATE.I is set when the handler runs.
See "4.3.1 Arm PE Architecture".

Unfortunately trusted-firmware is setting PMR, (which is specific to GIC, not the 'PE
architecture') to a value in the secure range, meaning no normal world interrupt can fire.

This is a bug in trusted firmware, but it looks like its firmly ingrained in their
'exception handling framework', and there is no appetite for fixing it.

So we'll have to hack round it in the kernel.


> If we crashed in the SDEI
> handler-running context (as with ACPI's AGDI) then we need to clean up the
> SDEI state before proceeding to the crash kernel so that the crash kernel
> can have working interrupts.  Try two COMPLETE_AND_RESUMEs in case both a
> normal and critical event were being handled.

There are multiple reasons the kernel might panic(), doing this in crash_smp_send_stop()
is a good call.


> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index 11cb99c4d298..03dc233bdaa1 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -909,13 +909,18 @@ NOKPROBE(call_on_irq_stack)
>  #include <asm/sdei.h>
>  #include <uapi/linux/arm_sdei.h>
>  
> -.macro sdei_handler_exit exit_mode
> -	/* On success, this call never returns... */
> +.macro sdei_handler_exit_fallthrough exit_mode
>  	cmp	\exit_mode, #SDEI_EXIT_SMC
>  	b.ne	99f
>  	smc	#0
> -	b	.
> +	b	100f
>  99:	hvc	#0
> +100:
> +.endm

This was a tangled mess before, but now that it can fallthrough we should try harder.
Could you look at smccc_patch_fw_mitigation_conduit for an example of how this can be done
cleanly. That would allow sdei_exit_mode to be removed.
(if that makes sense to do, please do it as a preparatory patch)


> @@ -1077,4 +1082,17 @@ alternative_else_nop_endif

> +SYM_CODE_START(sdei_handler_abort)
> +	mov_q	x0, SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME
> +	adr	x1, 1f
> +	ldr_l	x2, sdei_exit_mode
> +	sdei_handler_exit_fallthrough exit_mode=x2
> +	// either fallthrough if not in handler context, or exit the handler
> +	// and jump to the next instruction. Exit will stomp x0-x17, PSTATE,
> +	// ELR_ELx, and SPSR_ELx.
> +1:	ret
> +SYM_CODE_END(sdei_handler_abort)
> +NOKPROBE(sdei_handler_abort)
> +
>  #endif /* CONFIG_ARM_SDE_INTERFACE */

> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index ffc5d76cf695..bc1b3000197e 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -1047,10 +1047,8 @@ void crash_smp_send_stop(void)
>  	 * If this cpu is the only one alive at this point in time, online or
>  	 * not, there are no stop messages to be sent around, so just back out.
>  	 */
> -	if (num_other_online_cpus() == 0) {
> -		sdei_mask_local_cpu();
> -		return;
> -	}
> +	if (num_other_online_cpus() == 0)
> +		goto skip_ipi;
>  
>  	cpumask_copy(&mask, cpu_online_mask);
>  	cpumask_clear_cpu(smp_processor_id(), &mask);
> @@ -1069,7 +1067,15 @@ void crash_smp_send_stop(void)
>  		pr_warn("SMP: failed to stop secondary CPUs %*pbl\n",
>  			cpumask_pr_args(&mask));
>  
> +skip_ipi:
>  	sdei_mask_local_cpu();
> +	/*
> +	 * The crash may have happened in a critical event handler which
> +	 * preempted a normal handler. So at most we might have two
> +	 * levels of SDEI context to exit.
> +	 */
> +	sdei_handler_abort();
> +	sdei_handler_abort();

And if SDEI wasn't supported? Before SMC-CC you couldn't probe for firmware calls, you had
to know they were implemented. Its entirely possible there are platforms out there that
corrupt more than x0-x17 when you do this.
(also, what happens on machines where the kernel runs at EL2, and there is no EL3?)

You can tell if the kernel is in the 'middle' of an SDEI event based on the stack.

As this is a firmware bug, I'd like to print a warning. Something has gone wrong already,
otherwise we wouldn't be panic()ing, I think its important to identify this extra code is
running - in case it leads to more problems. (and not run it on platforms that aren't
affected).

Could we check PMR_EL1 to see if its got a value outside the kernel's PMR range before
triggering any of this. This would also let you know if you need to do this twice.


Thanks,

James
D Scott Phillips Feb. 10, 2023, 10:16 p.m. UTC | #2
James Morse <james.morse@arm.com> writes:

> Hi Scott,
>
> On 04/02/2023 00:08, D Scott Phillips wrote:
>> Interrupts are blocked in SDEI context, per the SDEI spec: "The client
>> interrupts cannot preempt the event handler."
>
> Firmware is supposed to do this by ensuring PSTATE.I is set when the
> handler runs.  See "4.3.1 Arm PE Architecture".
>
> Unfortunately trusted-firmware is setting PMR, (which is specific to
> GIC, not the 'PE architecture') to a value in the secure range,
> meaning no normal world interrupt can fire.

Hi James, in the case I see on my machine, the cause isn't an elevated
PMR, but rather an eleveated RPR. IRQs are blocked because the interrupt
that triggered the SDEI event is still active and hasn't dropped
priority on the core that's running the SDEI event handler. Specifically
see ICC_PMR_EL1 == 0xf8 (all priority bits set), but ICC_RPR_EL1 ==
0x70.

The spec does say that PSTATE.I will be set in the handler context, and
also that you "shouldn't" clear PSTATE.I while in the handler, but I
don't read that section or others in the spec as requiring firmware to
maintain the running priority, priority mask, etc such that just
clearing PSTATE.I gets you back to a working system.

Perhaps I'm reading into the "5.2 Event context" bit that says "The
client interrupts cannot preempt the event handler" more than I should,
but I'm taking that to mean, "don't count on working interrupts until
you end the handler"

In my AGDI case, the irq related to the event is entirely owned by
firmware, but in the case of irqs bound to SDEI events, the spec says
that the firmware will activate the interrupt before calling the event
handler and end the interrupt after. It's not explicit about dropping
priority, but it is explicit that the interrupt is meant to be active
during the sdei handler. So I'm not sure what the firmware is doing with
its AGDI interrupt is totally wrong here.

> This is a bug in trusted firmware, but it looks like its firmly
> ingrained in their 'exception handling framework', and there is no
> appetite for fixing it.
>
> So we'll have to hack round it in the kernel.
>
>
>> If we crashed in the SDEI
>> handler-running context (as with ACPI's AGDI) then we need to clean up the
>> SDEI state before proceeding to the crash kernel so that the crash kernel
>> can have working interrupts.  Try two COMPLETE_AND_RESUMEs in case both a
>> normal and critical event were being handled.
>
> There are multiple reasons the kernel might panic(), doing this in crash_smp_send_stop()
> is a good call.
>
>
>> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
>> index 11cb99c4d298..03dc233bdaa1 100644
>> --- a/arch/arm64/kernel/entry.S
>> +++ b/arch/arm64/kernel/entry.S
>> @@ -909,13 +909,18 @@ NOKPROBE(call_on_irq_stack)
>>  #include <asm/sdei.h>
>>  #include <uapi/linux/arm_sdei.h>
>>  
>> -.macro sdei_handler_exit exit_mode
>> -	/* On success, this call never returns... */
>> +.macro sdei_handler_exit_fallthrough exit_mode
>>  	cmp	\exit_mode, #SDEI_EXIT_SMC
>>  	b.ne	99f
>>  	smc	#0
>> -	b	.
>> +	b	100f
>>  99:	hvc	#0
>> +100:
>> +.endm
>
> This was a tangled mess before, but now that it can fallthrough we should try harder.
> Could you look at smccc_patch_fw_mitigation_conduit for an example of how this can be done
> cleanly. That would allow sdei_exit_mode to be removed.
> (if that makes sense to do, please do it as a preparatory patch)

OK, will do.

>
>> @@ -1077,4 +1082,17 @@ alternative_else_nop_endif
>
>> +SYM_CODE_START(sdei_handler_abort)
>> +	mov_q	x0, SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME
>> +	adr	x1, 1f
>> +	ldr_l	x2, sdei_exit_mode
>> +	sdei_handler_exit_fallthrough exit_mode=x2
>> +	// either fallthrough if not in handler context, or exit the handler
>> +	// and jump to the next instruction. Exit will stomp x0-x17, PSTATE,
>> +	// ELR_ELx, and SPSR_ELx.
>> +1:	ret
>> +SYM_CODE_END(sdei_handler_abort)
>> +NOKPROBE(sdei_handler_abort)
>> +
>>  #endif /* CONFIG_ARM_SDE_INTERFACE */
>
>> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
>> index ffc5d76cf695..bc1b3000197e 100644
>> --- a/arch/arm64/kernel/smp.c
>> +++ b/arch/arm64/kernel/smp.c
>> @@ -1047,10 +1047,8 @@ void crash_smp_send_stop(void)
>>  	 * If this cpu is the only one alive at this point in time, online or
>>  	 * not, there are no stop messages to be sent around, so just back out.
>>  	 */
>> -	if (num_other_online_cpus() == 0) {
>> -		sdei_mask_local_cpu();
>> -		return;
>> -	}
>> +	if (num_other_online_cpus() == 0)
>> +		goto skip_ipi;
>>  
>>  	cpumask_copy(&mask, cpu_online_mask);
>>  	cpumask_clear_cpu(smp_processor_id(), &mask);
>> @@ -1069,7 +1067,15 @@ void crash_smp_send_stop(void)
>>  		pr_warn("SMP: failed to stop secondary CPUs %*pbl\n",
>>  			cpumask_pr_args(&mask));
>>  
>> +skip_ipi:
>>  	sdei_mask_local_cpu();
>> +	/*
>> +	 * The crash may have happened in a critical event handler which
>> +	 * preempted a normal handler. So at most we might have two
>> +	 * levels of SDEI context to exit.
>> +	 */
>> +	sdei_handler_abort();
>> +	sdei_handler_abort();
>
> And if SDEI wasn't supported? Before SMC-CC you couldn't probe for firmware calls, you had
> to know they were implemented. Its entirely possible there are platforms out there that
> corrupt more than x0-x17 when you do this.
> (also, what happens on machines where the kernel runs at EL2, and there is no EL3?)
>
> You can tell if the kernel is in the 'middle' of an SDEI event based on the stack.

Ah, all good points. The case I was worried about was an SDEI normal
event that itself got preempted by a critical event before it switched
sp to the sdei stack, or after it had switched back but hadn't ended,
but you're right, "just try and see" is a really a non-solution.

How about:

if (sp within sdei_shadow_call_stack_normal ||
    sp within sdei_shadow_call_stack_critical)
  sdei_handler_abort()
if (sp within sdei_shadow_call_stack_critical &&
    (interrupted_regs.sp within sdei_shadow_call_stack_normal ||
     interrupted_regs.pc within __sdei_asm_handler))
  sdei_handler_abort()

> As this is a firmware bug, I'd like to print a warning. Something has
> gone wrong already, otherwise we wouldn't be panic()ing, I think its
> important to identify this extra code is running - in case it leads to
> more problems. (and not run it on platforms that aren't affected).

Yep, good idea, will do.

> Could we check PMR_EL1 to see if its got a value outside the kernel's
> PMR range before triggering any of this. This would also let you know
> if you need to do this twice.

PMR and RPR? What if the firmware fiddled with interrupts some other
way? Group enables? Maybe that's too paranoid and it's better to just
deal with firmware we've got.

>
> Thanks,
>
> James
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Mihai Carabas April 20, 2023, 11:50 a.m. UTC | #3
Tested-by: Mihai Carabas <mihai.carabas@oracle.com>
diff mbox series

Patch

diff --git a/arch/arm64/include/asm/sdei.h b/arch/arm64/include/asm/sdei.h
index 4292d9bafb9d..1030568db7d3 100644
--- a/arch/arm64/include/asm/sdei.h
+++ b/arch/arm64/include/asm/sdei.h
@@ -29,6 +29,9 @@  asmlinkage void __sdei_asm_entry_trampoline(unsigned long event_num,
 						   unsigned long pc,
 						   unsigned long pstate);
 
+/* End a possibly still running handler. Context is discarded. */
+void sdei_handler_abort(void);
+
 /*
  * The above entry point does the minimum to call C code. This function does
  * anything else, before calling the driver.
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 11cb99c4d298..03dc233bdaa1 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -909,13 +909,18 @@  NOKPROBE(call_on_irq_stack)
 #include <asm/sdei.h>
 #include <uapi/linux/arm_sdei.h>
 
-.macro sdei_handler_exit exit_mode
-	/* On success, this call never returns... */
+.macro sdei_handler_exit_fallthrough exit_mode
 	cmp	\exit_mode, #SDEI_EXIT_SMC
 	b.ne	99f
 	smc	#0
-	b	.
+	b	100f
 99:	hvc	#0
+100:
+.endm
+
+.macro sdei_handler_exit exit_mode
+	sdei_handler_exit_fallthrough exit_mode=\exit_mode
+	/* On success, this call never returns... */
 	b	.
 .endm
 
@@ -1077,4 +1082,17 @@  alternative_else_nop_endif
 #endif
 SYM_CODE_END(__sdei_asm_handler)
 NOKPROBE(__sdei_asm_handler)
+
+SYM_CODE_START(sdei_handler_abort)
+	mov_q	x0, SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME
+	adr	x1, 1f
+	ldr_l	x2, sdei_exit_mode
+	sdei_handler_exit_fallthrough exit_mode=x2
+	// either fallthrough if not in handler context, or exit the handler
+	// and jump to the next instruction. Exit will stomp x0-x17, PSTATE,
+	// ELR_ELx, and SPSR_ELx.
+1:	ret
+SYM_CODE_END(sdei_handler_abort)
+NOKPROBE(sdei_handler_abort)
+
 #endif /* CONFIG_ARM_SDE_INTERFACE */
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index ffc5d76cf695..bc1b3000197e 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -1047,10 +1047,8 @@  void crash_smp_send_stop(void)
 	 * If this cpu is the only one alive at this point in time, online or
 	 * not, there are no stop messages to be sent around, so just back out.
 	 */
-	if (num_other_online_cpus() == 0) {
-		sdei_mask_local_cpu();
-		return;
-	}
+	if (num_other_online_cpus() == 0)
+		goto skip_ipi;
 
 	cpumask_copy(&mask, cpu_online_mask);
 	cpumask_clear_cpu(smp_processor_id(), &mask);
@@ -1069,7 +1067,15 @@  void crash_smp_send_stop(void)
 		pr_warn("SMP: failed to stop secondary CPUs %*pbl\n",
 			cpumask_pr_args(&mask));
 
+skip_ipi:
 	sdei_mask_local_cpu();
+	/*
+	 * The crash may have happened in a critical event handler which
+	 * preempted a normal handler. So at most we might have two
+	 * levels of SDEI context to exit.
+	 */
+	sdei_handler_abort();
+	sdei_handler_abort();
 }
 
 bool smp_crash_stop_failed(void)