diff mbox series

[v1] RISC-V: Only default to spinwait on SBI-0.1 and M-mode

Message ID 20220421170354.10555-1-palmer@rivosinc.com (mailing list archive)
State New, archived
Headers show
Series [v1] RISC-V: Only default to spinwait on SBI-0.1 and M-mode | expand

Commit Message

Palmer Dabbelt April 21, 2022, 5:03 p.m. UTC
From: Palmer Dabbelt <palmer@rivosinc.com>

The spinwait boot method has been superseeded by the SBI HSM extension
for some time now, but it still enabled by default.  This causes some
issues on large hart count systems, which will hang if a physical hart
exists that is larger than NR_CPUS.

Users on modern SBI implemenation don't need spinwait, and while it's
probably possible to deal with some of the spinwait issues let's just
restrict the default to systems that are likely to actually use it.

Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
---
This seems to be the source of many of my new hangs when trying to test
the NR_CPUS=512 support.  It's not really related, just fallout from
testing different setups.
---
 arch/riscv/Kconfig | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Comments

Atish Patra April 21, 2022, 9:16 p.m. UTC | #1
On Thu, Apr 21, 2022 at 10:05 AM Palmer Dabbelt <palmer@rivosinc.com> wrote:
>
> From: Palmer Dabbelt <palmer@rivosinc.com>
>
> The spinwait boot method has been superseeded by the SBI HSM extension
> for some time now, but it still enabled by default.  This causes some
> issues on large hart count systems, which will hang if a physical hart
> exists that is larger than NR_CPUS.
>
> Users on modern SBI implemenation don't need spinwait, and while it's
> probably possible to deal with some of the spinwait issues let's just
> restrict the default to systems that are likely to actually use it.
>
> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
> ---
> This seems to be the source of many of my new hangs when trying to test
> the NR_CPUS=512 support.  It's not really related, just fallout from
> testing different setups.
> ---
>  arch/riscv/Kconfig | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 00fd9c548f26..dd5e975abe37 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -359,7 +359,7 @@ config RISCV_SBI_V01
>  config RISCV_BOOT_SPINWAIT
>         bool "Spinwait booting method"
>         depends on SMP
> -       default y
> +       default y if RISCV_SBI_V01 || RISCV_M_MODE
>         help
>           This enables support for booting Linux via spinwait method. In the
>           spinwait method, all cores randomly jump to Linux. One of the cores
> @@ -370,6 +370,12 @@ config RISCV_BOOT_SPINWAIT
>           rely on ordered booting via SBI HSM extension which gets chosen
>           dynamically at runtime if the firmware supports it.
>
> +         Since spinwait is incompatible with sparse hart IDs, it requires
> +         NR_CPUS be large enough to contain the physical hart ID of the first
> +         hart to enter Linux.
> +
> +         If unsure what to do here, say N.
> +
>  config KEXEC
>         bool "Kexec system call"
>         select KEXEC_CORE
> --
> 2.34.1
>


Reviewed-by: Atish Patra <atishp@rivosinc.com>
Anup Patel April 22, 2022, 4:52 a.m. UTC | #2
On Thu, Apr 21, 2022 at 10:35 PM Palmer Dabbelt <palmer@rivosinc.com> wrote:
>
> From: Palmer Dabbelt <palmer@rivosinc.com>
>
> The spinwait boot method has been superseeded by the SBI HSM extension
> for some time now, but it still enabled by default.  This causes some
> issues on large hart count systems, which will hang if a physical hart
> exists that is larger than NR_CPUS.
>
> Users on modern SBI implemenation don't need spinwait, and while it's
> probably possible to deal with some of the spinwait issues let's just
> restrict the default to systems that are likely to actually use it.
>
> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>

Looks good to me.

Reviewed-by: Anup Patel <anup@brainfault.org>

Regards,
Anup

> ---
> This seems to be the source of many of my new hangs when trying to test
> the NR_CPUS=512 support.  It's not really related, just fallout from
> testing different setups.
> ---
>  arch/riscv/Kconfig | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 00fd9c548f26..dd5e975abe37 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -359,7 +359,7 @@ config RISCV_SBI_V01
>  config RISCV_BOOT_SPINWAIT
>         bool "Spinwait booting method"
>         depends on SMP
> -       default y
> +       default y if RISCV_SBI_V01 || RISCV_M_MODE
>         help
>           This enables support for booting Linux via spinwait method. In the
>           spinwait method, all cores randomly jump to Linux. One of the cores
> @@ -370,6 +370,12 @@ config RISCV_BOOT_SPINWAIT
>           rely on ordered booting via SBI HSM extension which gets chosen
>           dynamically at runtime if the firmware supports it.
>
> +         Since spinwait is incompatible with sparse hart IDs, it requires
> +         NR_CPUS be large enough to contain the physical hart ID of the first
> +         hart to enter Linux.
> +
> +         If unsure what to do here, say N.
> +
>  config KEXEC
>         bool "Kexec system call"
>         select KEXEC_CORE
> --
> 2.34.1
>
Palmer Dabbelt June 2, 2022, 2:08 a.m. UTC | #3
On Thu, 21 Apr 2022 10:03:55 PDT (-0700), Palmer Dabbelt wrote:
> From: Palmer Dabbelt <palmer@rivosinc.com>
>
> The spinwait boot method has been superseeded by the SBI HSM extension
> for some time now, but it still enabled by default.  This causes some
> issues on large hart count systems, which will hang if a physical hart
> exists that is larger than NR_CPUS.
>
> Users on modern SBI implemenation don't need spinwait, and while it's
> probably possible to deal with some of the spinwait issues let's just
> restrict the default to systems that are likely to actually use it.
>
> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
> ---
> This seems to be the source of many of my new hangs when trying to test
> the NR_CPUS=512 support.  It's not really related, just fallout from
> testing different setups.
> ---
>  arch/riscv/Kconfig | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 00fd9c548f26..dd5e975abe37 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -359,7 +359,7 @@ config RISCV_SBI_V01
>  config RISCV_BOOT_SPINWAIT
>  	bool "Spinwait booting method"
>  	depends on SMP
> -	default y
> +	default y if RISCV_SBI_V01 || RISCV_M_MODE
>  	help
>  	  This enables support for booting Linux via spinwait method. In the
>  	  spinwait method, all cores randomly jump to Linux. One of the cores
> @@ -370,6 +370,12 @@ config RISCV_BOOT_SPINWAIT
>  	  rely on ordered booting via SBI HSM extension which gets chosen
>  	  dynamically at runtime if the firmware supports it.
>
> +	  Since spinwait is incompatible with sparse hart IDs, it requires
> +	  NR_CPUS be large enough to contain the physical hart ID of the first
> +	  hart to enter Linux.
> +
> +	  If unsure what to do here, say N.
> +
>  config KEXEC
>  	bool "Kexec system call"
>  	select KEXEC_CORE

This is on for-next, with some typos fixed.
diff mbox series

Patch

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 00fd9c548f26..dd5e975abe37 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -359,7 +359,7 @@  config RISCV_SBI_V01
 config RISCV_BOOT_SPINWAIT
 	bool "Spinwait booting method"
 	depends on SMP
-	default y
+	default y if RISCV_SBI_V01 || RISCV_M_MODE
 	help
 	  This enables support for booting Linux via spinwait method. In the
 	  spinwait method, all cores randomly jump to Linux. One of the cores
@@ -370,6 +370,12 @@  config RISCV_BOOT_SPINWAIT
 	  rely on ordered booting via SBI HSM extension which gets chosen
 	  dynamically at runtime if the firmware supports it.
 
+	  Since spinwait is incompatible with sparse hart IDs, it requires
+	  NR_CPUS be large enough to contain the physical hart ID of the first
+	  hart to enter Linux.
+
+	  If unsure what to do here, say N.
+
 config KEXEC
 	bool "Kexec system call"
 	select KEXEC_CORE