diff mbox series

[V4,6/8] riscv: Support HAVE_IRQ_EXIT_ON_IRQ_STACK

Message ID 20220908022506.1275799-7-guoren@kernel.org (mailing list archive)
State New, archived
Headers show
Series riscv: Add GENERIC_ENTRY, irq stack support | expand

Commit Message

Guo Ren Sept. 8, 2022, 2:25 a.m. UTC
From: Guo Ren <guoren@linux.alibaba.com>

Add independent irq stacks for percpu to prevent kernel stack overflows.
It is also compatible with VMAP_STACK by implementing
arch_alloc_vmap_stack.  Many architectures have supported
HAVE_IRQ_EXIT_ON_IRQ_STACK, riscv should follow up.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
 arch/riscv/Kconfig                   |  8 +++++
 arch/riscv/include/asm/irq.h         |  3 ++
 arch/riscv/include/asm/thread_info.h |  2 ++
 arch/riscv/include/asm/vmap_stack.h  | 28 ++++++++++++++++
 arch/riscv/kernel/entry.S            | 27 ++++++++++++++++
 arch/riscv/kernel/irq.c              | 48 ++++++++++++++++++++++++++--
 6 files changed, 114 insertions(+), 2 deletions(-)
 create mode 100644 arch/riscv/include/asm/vmap_stack.h

Comments

Sebastian Andrzej Siewior Sept. 8, 2022, 4:08 p.m. UTC | #1
On 2022-09-07 22:25:04 [-0400], guoren@kernel.org wrote:
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index a07bb3b73b5b..a8a12b4ba1a9 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -433,6 +433,14 @@ config FPU
>  
>  	  If you don't know what to do here, say Y.
>  
> +config IRQ_STACKS
> +	bool "Independent irq stacks"
> +	default y
> +	select HAVE_IRQ_EXIT_ON_IRQ_STACK
> +	help
> +	  Add independent irq stacks for percpu to prevent kernel stack overflows.
> +	  We may save some memory footprint by disabling IRQ_STACKS.

Do you really think that it is needed to save memory here? Avoiding
stack overflows in deep call chains is probably more important than
saving ~8KiB per CPU.

Sebastian
David Laight Sept. 9, 2022, 7:30 a.m. UTC | #2
From: Sebastian Andrzej Siewior
> Sent: 08 September 2022 17:08
> 
> On 2022-09-07 22:25:04 [-0400], guoren@kernel.org wrote:
> > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> > index a07bb3b73b5b..a8a12b4ba1a9 100644
> > --- a/arch/riscv/Kconfig
> > +++ b/arch/riscv/Kconfig
> > @@ -433,6 +433,14 @@ config FPU
> >
> >  	  If you don't know what to do here, say Y.
> >
> > +config IRQ_STACKS
> > +	bool "Independent irq stacks"
> > +	default y
> > +	select HAVE_IRQ_EXIT_ON_IRQ_STACK
> > +	help
> > +	  Add independent irq stacks for percpu to prevent kernel stack overflows.
> > +	  We may save some memory footprint by disabling IRQ_STACKS.
> 
> Do you really think that it is needed to save memory here? Avoiding
> stack overflows in deep call chains is probably more important than
> saving ~8KiB per CPU.

Particularly if a 64bit build is using small stacks.

Without static analysis of actual call chain depth it is
really difficult to trim the stack size.

I'd bet (a few beers) that the deepest stack use in inside
the console print code form a printk() (eg warn_on_once)
in an obscure error path somewhere.
This won't be hit during any normal testing.

I think that the analysis objtool does is getting close
to be able to generate the raw data that can be used for
static stack depth analysis.
You need the 'CFI' constants for indirect calls and
some assumptions about depth of recursive calls.
But apart from that the code to process the raw output
isn't that complex.

A nice task for someone with some spare time.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Guo Ren Sept. 11, 2022, 1:13 a.m. UTC | #3
On Fri, Sep 9, 2022 at 3:30 PM David Laight <David.Laight@aculab.com> wrote:
>
> From: Sebastian Andrzej Siewior
> > Sent: 08 September 2022 17:08
> >
> > On 2022-09-07 22:25:04 [-0400], guoren@kernel.org wrote:
> > > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> > > index a07bb3b73b5b..a8a12b4ba1a9 100644
> > > --- a/arch/riscv/Kconfig
> > > +++ b/arch/riscv/Kconfig
> > > @@ -433,6 +433,14 @@ config FPU
> > >
> > >       If you don't know what to do here, say Y.
> > >
> > > +config IRQ_STACKS
> > > +   bool "Independent irq stacks"
> > > +   default y
> > > +   select HAVE_IRQ_EXIT_ON_IRQ_STACK
> > > +   help
> > > +     Add independent irq stacks for percpu to prevent kernel stack overflows.
> > > +     We may save some memory footprint by disabling IRQ_STACKS.
> >
> > Do you really think that it is needed to save memory here? Avoiding
> > stack overflows in deep call chains is probably more important than
> > saving ~8KiB per CPU.
Original riscv is !IRQ_STACKS, I just give a config to make it back.
So I would add a CONFIG_EXPERT in the next version.

Actually, I have a similar opinion to you, IRQ_STACKS should be force
enabled. But as a new feature, we should give users a choice - use or
not.

>
> Particularly if a 64bit build is using small stacks.
>
> Without static analysis of actual call chain depth it is
> really difficult to trim the stack size.
>
> I'd bet (a few beers) that the deepest stack use in inside
> the console print code form a printk() (eg warn_on_once)
> in an obscure error path somewhere.
> This won't be hit during any normal testing.
That means stack overflow would be hidden a lot. But we could enable
VMAP_STACK & STACK_LEAK [1].

[1]: https://lore.kernel.org/lkml/20220907014809.919979-1-guoren@kernel.org/

>
> I think that the analysis objtool does is getting close
> to be able to generate the raw data that can be used for
> static stack depth analysis.
> You need the 'CFI' constants for indirect calls and
> some assumptions about depth of recursive calls.
> But apart from that the code to process the raw output
> isn't that complex.
>
> A nice task for someone with some spare time.
>
>         David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
diff mbox series

Patch

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index a07bb3b73b5b..a8a12b4ba1a9 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -433,6 +433,14 @@  config FPU
 
 	  If you don't know what to do here, say Y.
 
+config IRQ_STACKS
+	bool "Independent irq stacks"
+	default y
+	select HAVE_IRQ_EXIT_ON_IRQ_STACK
+	help
+	  Add independent irq stacks for percpu to prevent kernel stack overflows.
+	  We may save some memory footprint by disabling IRQ_STACKS.
+
 endmenu # "Platform type"
 
 menu "Kernel features"
diff --git a/arch/riscv/include/asm/irq.h b/arch/riscv/include/asm/irq.h
index e4c435509983..205e1c693dfd 100644
--- a/arch/riscv/include/asm/irq.h
+++ b/arch/riscv/include/asm/irq.h
@@ -13,5 +13,8 @@ 
 #include <asm-generic/irq.h>
 
 extern void __init init_IRQ(void);
+asmlinkage void call_on_stack(struct pt_regs *regs, ulong *sp,
+				     void (*fn)(struct pt_regs *), ulong tmp);
+asmlinkage void noinstr do_riscv_irq(struct pt_regs *regs);
 
 #endif /* _ASM_RISCV_IRQ_H */
diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h
index 7de4fb96f0b5..043da8ccc7e6 100644
--- a/arch/riscv/include/asm/thread_info.h
+++ b/arch/riscv/include/asm/thread_info.h
@@ -40,6 +40,8 @@ 
 #define OVERFLOW_STACK_SIZE     SZ_4K
 #define SHADOW_OVERFLOW_STACK_SIZE (1024)
 
+#define IRQ_STACK_SIZE		THREAD_SIZE
+
 #ifndef __ASSEMBLY__
 
 extern long shadow_stack[SHADOW_OVERFLOW_STACK_SIZE / sizeof(long)];
diff --git a/arch/riscv/include/asm/vmap_stack.h b/arch/riscv/include/asm/vmap_stack.h
new file mode 100644
index 000000000000..3fbf481abf4f
--- /dev/null
+++ b/arch/riscv/include/asm/vmap_stack.h
@@ -0,0 +1,28 @@ 
+/* SPDX-License-Identifier: GPL-2.0 */
+// Copied from arch/arm64/include/asm/vmap_stack.h.
+#ifndef _ASM_RISCV_VMAP_STACK_H
+#define _ASM_RISCV_VMAP_STACK_H
+
+#include <linux/bug.h>
+#include <linux/gfp.h>
+#include <linux/kconfig.h>
+#include <linux/vmalloc.h>
+#include <linux/pgtable.h>
+#include <asm/thread_info.h>
+
+/*
+ * To ensure that VMAP'd stack overflow detection works correctly, all VMAP'd
+ * stacks need to have the same alignment.
+ */
+static inline unsigned long *arch_alloc_vmap_stack(size_t stack_size, int node)
+{
+	void *p;
+
+	BUILD_BUG_ON(!IS_ENABLED(CONFIG_VMAP_STACK));
+
+	p = __vmalloc_node(stack_size, THREAD_ALIGN, THREADINFO_GFP, node,
+			__builtin_return_address(0));
+	return kasan_reset_tag(p);
+}
+
+#endif /* _ASM_RISCV_VMAP_STACK_H */
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
index 5f49517cd3a2..426529b84db0 100644
--- a/arch/riscv/kernel/entry.S
+++ b/arch/riscv/kernel/entry.S
@@ -332,6 +332,33 @@  ENTRY(ret_from_kernel_thread)
 	tail syscall_exit_to_user_mode
 ENDPROC(ret_from_kernel_thread)
 
+#ifdef CONFIG_IRQ_STACKS
+ENTRY(call_on_stack)
+	/* Create a frame record to save our ra and fp */
+	addi	sp, sp, -RISCV_SZPTR
+	REG_S	ra, (sp)
+	addi	sp, sp, -RISCV_SZPTR
+	REG_S	fp, (sp)
+
+	/* Save sp in fp */
+	move	fp, sp
+
+	/* Move to the new stack and call the function there */
+	li	a3, IRQ_STACK_SIZE
+	add	sp, a1, a3
+	jalr	a2
+
+	/*
+	 * Restore sp from prev fp, and fp, ra from the frame
+	 */
+	move	sp, fp
+	REG_L	fp, (sp)
+	addi	sp, sp, RISCV_SZPTR
+	REG_L	ra, (sp)
+	addi	sp, sp, RISCV_SZPTR
+	ret
+ENDPROC(call_on_stack)
+#endif
 
 /*
  * Integer register context switch
diff --git a/arch/riscv/kernel/irq.c b/arch/riscv/kernel/irq.c
index 24c2e1bd756a..5ad4952203c5 100644
--- a/arch/riscv/kernel/irq.c
+++ b/arch/riscv/kernel/irq.c
@@ -10,6 +10,37 @@ 
 #include <linux/irqchip.h>
 #include <linux/seq_file.h>
 #include <asm/smp.h>
+#include <asm/vmap_stack.h>
+
+#ifdef CONFIG_IRQ_STACKS
+static DEFINE_PER_CPU(ulong *, irq_stack_ptr);
+
+#ifdef CONFIG_VMAP_STACK
+static void init_irq_stacks(void)
+{
+	int cpu;
+	ulong *p;
+
+	for_each_possible_cpu(cpu) {
+		p = arch_alloc_vmap_stack(IRQ_STACK_SIZE, cpu_to_node(cpu));
+		per_cpu(irq_stack_ptr, cpu) = p;
+	}
+}
+#else
+/* irq stack only needs to be 16 byte aligned - not IRQ_STACK_SIZE aligned. */
+DEFINE_PER_CPU_ALIGNED(ulong [IRQ_STACK_SIZE/sizeof(ulong)], irq_stack);
+
+static void init_irq_stacks(void)
+{
+	int cpu;
+
+	for_each_possible_cpu(cpu)
+		per_cpu(irq_stack_ptr, cpu) = per_cpu(irq_stack, cpu);
+}
+#endif /* CONFIG_VMAP_STACK */
+#else
+static void init_irq_stacks(void) {}
+#endif /* CONFIG_IRQ_STACKS */
 
 int arch_show_interrupts(struct seq_file *p, int prec)
 {
@@ -19,21 +50,34 @@  int arch_show_interrupts(struct seq_file *p, int prec)
 
 void __init init_IRQ(void)
 {
+	init_irq_stacks();
 	irqchip_init();
 	if (!handle_arch_irq)
 		panic("No interrupt controller found.");
 }
 
-asmlinkage void noinstr do_riscv_irq(struct pt_regs *regs)
+static void noinstr handle_riscv_irq(struct pt_regs *regs)
 {
 	struct pt_regs *old_regs;
-	irqentry_state_t state = irqentry_enter(regs);
 
 	irq_enter_rcu();
 	old_regs = set_irq_regs(regs);
 	handle_arch_irq(regs);
 	set_irq_regs(old_regs);
 	irq_exit_rcu();
+}
+
+asmlinkage void noinstr do_riscv_irq(struct pt_regs *regs)
+{
+	irqentry_state_t state = irqentry_enter(regs);
+#ifdef CONFIG_IRQ_STACKS
+	ulong *sp = per_cpu(irq_stack_ptr, smp_processor_id());
+
+	if (on_thread_stack())
+		call_on_stack(regs, sp, handle_riscv_irq, 0);
+	else
+#endif
+		handle_riscv_irq(regs);
 
 	irqentry_exit(regs, state);
 }