Message ID | 1607678131-20347-1-git-send-email-maninder1.s@samsung.com (mailing list archive) |
---|---|
State | New, archived |
Series | [1/1] arm64/entry.S: check for stack overflow in el1 case only |
On Fri, Dec 11, 2020 at 02:45:31PM +0530, Maninder Singh wrote:
> current code checks for sp bit flip in all exceptions,
> but only el1 exceptions requires this. el0 can not enter
> into stack overflow case directly.
>
> it will improve performance for el0 exceptions and interrupts.
>
> Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
> Signed-off-by: Vaneet Narang <v.narang@samsung.com>

I did consider doing this at the time Ard and I wrote the overflow
detection, but there was no measurable impact on the workloads that I
tested, and it seemed worthwhile to have this as a sanity check in case
the SP was somehow corrupted (and to avoid any surprising differences
between the EL0 and EL1 entry paths).

When you say "it will improve performance for el0 exceptions and
interrupts", do you have a workload where this has a measurable impact,
or was this found by inspection? Unless this is causing a real issue,
I'd prefer to leave it as-is for now.

Thanks,
Mark.

> ---
>  arch/arm64/kernel/entry.S | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index 2a93fa5..cad8faf 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -77,6 +77,7 @@ alternative_else_nop_endif
>
>  	sub	sp, sp, #S_FRAME_SIZE
>  #ifdef CONFIG_VMAP_STACK
> +	.if	\el == 1
>  	/*
>  	 * Test whether the SP has overflowed, without corrupting a GPR.
>  	 * Task and IRQ stacks are aligned so that SP & (1 << THREAD_SHIFT)
> @@ -118,6 +119,7 @@ alternative_else_nop_endif
>  	/* We were already on the overflow stack. Restore sp/x0 and carry on. */
>  	sub	sp, sp, x0
>  	mrs	x0, tpidrro_el0
> +	.endif
>  #endif
>  	b	el\()\el\()_\label
>  	.endm
> --
> 1.9.1
>

On Thu, Jan 07, 2021 at 11:29:03AM +0000, Mark Rutland wrote:
> On Fri, Dec 11, 2020 at 02:45:31PM +0530, Maninder Singh wrote:
> > current code checks for sp bit flip in all exceptions,
> > but only el1 exceptions requires this. el0 can not enter
> > into stack overflow case directly.
> >
> > it will improve performance for el0 exceptions and interrupts.
> >
> > Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
> > Signed-off-by: Vaneet Narang <v.narang@samsung.com>
>
> I did consider doing this at the time Ard and I wrote the overflow
> detection, but there was no measurable impact on the workloads that I
> tested, and it seemed worthwhile to have this as a sanity check in case
> the SP was somehow corrupted (and to avoid any surprising differences
> between the EL0 and EL1 entry paths).
>
> When you say "it will improve performance for el0 exceptions and
> interrupts", do you have a workload where this has a measurable impact,
> or was this found by inspection? Unless this is causing a real issue,
> I'd prefer to leave it as-is for now.

Maninder -- please could you follow up on Mark's question?

Will

Hi Mark, Will

On Thu, Jan 07, 2021 at 11:29:03AM +0000, Mark Rutland wrote:
>> On Fri, Dec 11, 2020 at 02:45:31PM +0530, Maninder Singh wrote:
>> > current code checks for sp bit flip in all exceptions,
>> > but only el1 exceptions requires this. el0 can not enter
>> > into stack overflow case directly.
>> >
>> > it will improve performance for el0 exceptions and interrupts.
>> >
>> > Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
>> > Signed-off-by: Vaneet Narang <v.narang@samsung.com>
>>
>> I did consider doing this at the time Ard and I wrote the overflow
>> detection, but there was no measurable impact on the workloads that I
>> tested, and it seemed worthwhile to have this as a sanity check in case
>> the SP was somehow corrupted (and to avoid any surprising differences
>> between the EL0 and EL1 entry paths).
>>
>> When you say "it will improve performance for el0 exceptions and
>> interrupts", do you have a workload where this has a measurable impact,
>> or was this found by inspection? Unless this is causing a real issue,
>> I'd prefer to leave it as-is for now.
>

We have not measured performance with any tool because, as you said,
it's not measurable, but we think that removing some instructions would
be better, which is why we suggested this change. In el0 there is no
chance of sp overflow, so those 5 instructions can be avoided.

We tried this on our setup because we were changing some of the
VMAP_STACK design in our kernel for further enhancements. That made the
code in this path somewhat larger, so we skipped that part for el0 in
our local kernel.

Thanks,
Maninder Singh

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 2a93fa5..cad8faf 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -77,6 +77,7 @@ alternative_else_nop_endif
 
 	sub	sp, sp, #S_FRAME_SIZE
 #ifdef CONFIG_VMAP_STACK
+	.if	\el == 1
 	/*
 	 * Test whether the SP has overflowed, without corrupting a GPR.
 	 * Task and IRQ stacks are aligned so that SP & (1 << THREAD_SHIFT)
@@ -118,6 +119,7 @@ alternative_else_nop_endif
 	/* We were already on the overflow stack. Restore sp/x0 and carry on. */
 	sub	sp, sp, x0
 	mrs	x0, tpidrro_el0
+	.endif
 #endif
 	b	el\()\el\()_\label
 	.endm
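
For readers unfamiliar with the check being discussed, the following is a rough C model of the "SP bit flip" test named in the patch comment. It is only a sketch, not the kernel's code: the real entry path does this in assembly without a spare register. The 16 KiB stack size, the assumption that stacks are aligned to twice their size, and the helper names `sp_overflowed` and `entry_checks_overflow` are illustrative assumptions, not taken from the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for illustration: 16 KiB stacks, i.e. THREAD_SHIFT = 14. */
#define THREAD_SHIFT 14
#define THREAD_SIZE  (UINT64_C(1) << THREAD_SHIFT)

/*
 * Hypothetical helper modelling the test SP & (1 << THREAD_SHIFT).
 * Assuming each stack is aligned to 2 * THREAD_SIZE, every address
 * inside the stack has bit THREAD_SHIFT clear, and an SP that has
 * dropped just below the stack base has that bit set.
 */
static bool sp_overflowed(uint64_t sp)
{
	return (sp & THREAD_SIZE) != 0;
}

/*
 * Hypothetical model of the proposed change: the test is only
 * performed for exceptions taken from EL1 (el == 1).
 */
static bool entry_checks_overflow(int el, uint64_t sp)
{
	if (el != 1)
		return false;	/* EL0 entry skips the test under the patch */
	return sp_overflowed(sp);
}
```

In this model, an entry from EL0 with a corrupted SP is not flagged once the patch is applied, which is the sanity-check trade-off Mark raises above.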