
[3/3] arm64: smp: Treat unknown boot failures as being 'stuck in kernel'

Message ID 20190827151815.2160-4-will@kernel.org (mailing list archive)
State Mainlined
Commit ebef746543fd1aa162216b0e484eb9062b65741d
Series Try to make SMP booting slightly less fragile

Commit Message

Will Deacon Aug. 27, 2019, 3:18 p.m. UTC
When we fail to bring a secondary CPU online and it fails in an unknown
state, we should assume the worst and increment 'cpus_stuck_in_kernel'
so that things like kexec() are disabled.

Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/kernel/smp.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Mark Rutland Aug. 27, 2019, 3:59 p.m. UTC | #1
On Tue, Aug 27, 2019 at 04:18:15PM +0100, Will Deacon wrote:
> When we fail to bring a secondary CPU online and it fails in an unknown
> state, we should assume the worst and increment 'cpus_stuck_in_kernel'
> so that things like kexec() are disabled.

Definitely! I had assumed we already did this, but I see that we don't.

> Signed-off-by: Will Deacon <will@kernel.org>

I don't see a nicer way of doing this, so:

Reviewed-by: Mark Rutland <mark.rutland@arm.com>

Thanks,
Mark.

> ---
>  arch/arm64/kernel/smp.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 1f8aeb77cba5..dc9fe879c279 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -147,6 +147,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
>  		default:
>  			pr_err("CPU%u: failed in unknown state : 0x%lx\n",
>  					cpu, status);
> +			cpus_stuck_in_kernel++;
>  			break;
>  		case CPU_KILL_ME:
>  			if (!op_cpu_kill(cpu)) {
> -- 
> 2.11.0
>

Patch

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 1f8aeb77cba5..dc9fe879c279 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -147,6 +147,7 @@  int __cpu_up(unsigned int cpu, struct task_struct *idle)
 		default:
 			pr_err("CPU%u: failed in unknown state : 0x%lx\n",
 					cpu, status);
+			cpus_stuck_in_kernel++;
 			break;
 		case CPU_KILL_ME:
 			if (!op_cpu_kill(cpu)) {
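
For illustration only, the effect of the one-line change can be modelled in user space. This is a minimal sketch, not the kernel code: the enum values, record_boot_failure() and kexec_allowed() are hypothetical names invented for the example, and the CPU_KILL_ME path is assumed to power the CPU off cleanly. It shows that an unknown failure status now bumps the same counter as a CPU known to be stuck, so anything gated on that counter (such as kexec) is refused.

```c
#include <stdbool.h>
#include <stdio.h>

/* Illustrative status codes; names echo the patch context, values are assumed. */
enum boot_status {
	BOOT_SUCCESS = 0,
	BOOT_KILL_ME,
	BOOT_STUCK_IN_KERNEL,
};

/* Models the kernel's 'cpus_stuck_in_kernel' counter. */
static int stuck_count;

/* Hypothetical helper: record how a secondary CPU failed to come online. */
static void record_boot_failure(unsigned int cpu, long status)
{
	switch (status) {
	case BOOT_KILL_ME:
		/* Assumption: the CPU was powered off cleanly, so it is not stuck. */
		break;
	case BOOT_STUCK_IN_KERNEL:
		stuck_count++;
		break;
	default:
		fprintf(stderr, "CPU%u: failed in unknown state : 0x%lx\n",
			cpu, (unsigned long)status);
		/* The change under review: assume the worst for unknown states. */
		stuck_count++;
		break;
	}
}

/* Hypothetical gate: refuse kexec while any CPU may still be in the kernel. */
static bool kexec_allowed(void)
{
	return stuck_count == 0;
}
```

With this model, a call such as record_boot_failure(2, 0x7f) leaves kexec_allowed() returning false, whereas before the change an unknown status would only have been logged.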