diff mbox series

[1/3] arm64: vdso: Don't prefix sigreturn trampoline with a BTI C instruction

Message ID 20200519121818.14511-2-will@kernel.org (mailing list archive)
State New, archived
Headers show
Series arm64 sigreturn unwinding fixes | expand

Commit Message

Will Deacon May 19, 2020, 12:18 p.m. UTC
For better or worse, GDB relies on the exact instruction sequence in the
VDSO sigreturn trampoline in order to unwind from signals correctly.
Commit 714a8d02ca4d ("arm64: asm: Override SYM_FUNC_START when building
the kernel with BTI") unfortunately added a BTI C instruction to the
start of __kernel_rt_sigreturn, which breaks this check. Thankfully,
it's also not required, since the trampoline is called from a RET
instruction when returning from the signal handler

Remove the unnecessary BTI C instruction from __kernel_rt_sigreturn.

Cc: Dave Martin <dave.martin@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Daniel Kiss <daniel.kiss@arm.com>
Cc: Tamas Zsoldos <tamas.zsoldos@arm.com>
Fixes: 714a8d02ca4d ("arm64: asm: Override SYM_FUNC_START when building the kernel with BTI")
Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/kernel/vdso/sigreturn.S | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Comments

Mark Brown May 19, 2020, 12:38 p.m. UTC | #1
On Tue, May 19, 2020 at 01:18:16PM +0100, Will Deacon wrote:

> Fixes: 714a8d02ca4d ("arm64: asm: Override SYM_FUNC_START when building the kernel with BTI")

I'd say it's the annotation conversion not this, and also that the
bikeshed would be most fetching in orange.

c91db232da484851 (arm64: vdso: Convert to modern assembler annotations)

> -SYM_FUNC_START(__kernel_rt_sigreturn)
> +/*
> + * GDB relies on being able to identify the sigreturn instruction sequence to
> + * unwind from signal handlers. We cannot, therefore, use SYM_FUNC_START()
> + * here, as it will emit a BTI C instruction and break the unwinder. Thankfully,
> + * this function is only ever called from a RET and so omitting the landing pad
> + * is perfectly fine.
> + */
> +SYM_START(__kernel_rt_sigreturn, SYM_L_GLOBAL, SYM_A_ALIGN)

Shouldn't this be a SYM_CODE_START()?  It's the same thing as above
currently and we'll break an awful lot more if we change what it does in
a way that affects the code, plus the use of CODE basically says the
above - it's a "this is non-standard and we know exactly what we're
doing, don't mess with it" annotation.  If not then it'd be good to
cover that in the comment since otherwise this seems like it's asking
for a cleanup, we shouldn't really have raw SYM_START() in code.

I guess we also ought to annotate the 32 bit sigreturns as CODE too,
though it's academic there without BTI.
Dave Martin May 19, 2020, 1:21 p.m. UTC | #2
On Tue, May 19, 2020 at 01:18:16PM +0100, Will Deacon wrote:
> For better or worse, GDB relies on the exact instruction sequence in the
> VDSO sigreturn trampoline in order to unwind from signals correctly.

Are you sure?  I'm struggling to find the relevant code in gdb.

> Commit 714a8d02ca4d ("arm64: asm: Override SYM_FUNC_START when building
> the kernel with BTI") unfortunately added a BTI C instruction to the
> start of __kernel_rt_sigreturn, which breaks this check. Thankfully,
> it's also not required, since the trampoline is called from a RET
> instruction when returning from the signal handler
> 
> Remove the unnecessary BTI C instruction from __kernel_rt_sigreturn.
> 
> Cc: Dave Martin <dave.martin@arm.com>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: Daniel Kiss <daniel.kiss@arm.com>
> Cc: Tamas Zsoldos <tamas.zsoldos@arm.com>
> Fixes: 714a8d02ca4d ("arm64: asm: Override SYM_FUNC_START when building the kernel with BTI")
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/kernel/vdso/sigreturn.S | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kernel/vdso/sigreturn.S b/arch/arm64/kernel/vdso/sigreturn.S
> index 3fb13b81f780..83ac284dae79 100644
> --- a/arch/arm64/kernel/vdso/sigreturn.S
> +++ b/arch/arm64/kernel/vdso/sigreturn.S
> @@ -15,7 +15,14 @@
>  	.text
>  
>  	nop
> -SYM_FUNC_START(__kernel_rt_sigreturn)
> +/*
> + * GDB relies on being able to identify the sigreturn instruction sequence to
> + * unwind from signal handlers. We cannot, therefore, use SYM_FUNC_START()
> + * here, as it will emit a BTI C instruction and break the unwinder. Thankfully,
> + * this function is only ever called from a RET and so omitting the landing pad
> + * is perfectly fine.
> + */
> +SYM_START(__kernel_rt_sigreturn, SYM_L_GLOBAL, SYM_A_ALIGN)
>  	.cfi_startproc
>  	.cfi_signal_frame
>  	.cfi_def_cfa	x29, 0
> -- 
> 2.26.2.761.g0e0b3e54be-goog
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Dave Martin May 19, 2020, 1:25 p.m. UTC | #3
On Tue, May 19, 2020 at 01:38:43PM +0100, Mark Brown wrote:
> On Tue, May 19, 2020 at 01:18:16PM +0100, Will Deacon wrote:
> 
> > Fixes: 714a8d02ca4d ("arm64: asm: Override SYM_FUNC_START when building the kernel with BTI")
> 
> I'd say it's the annotation conversion not this, and also that the
> bikeshed would be most fetching in orange.
> 
> c91db232da484851 (arm64: vdso: Convert to modern assembler annotations)
> 
> > -SYM_FUNC_START(__kernel_rt_sigreturn)
> > +/*
> > + * GDB relies on being able to identify the sigreturn instruction sequence to
> > + * unwind from signal handlers. We cannot, therefore, use SYM_FUNC_START()
> > + * here, as it will emit a BTI C instruction and break the unwinder. Thankfully,
> > + * this function is only ever called from a RET and so omitting the landing pad
> > + * is perfectly fine.
> > + */
> > +SYM_START(__kernel_rt_sigreturn, SYM_L_GLOBAL, SYM_A_ALIGN)
> 
> Shouldn't this be a SYM_CODE_START()?  It's the same thing as above
> currently and we'll break an awful lot more if we change what it does in
> a way that affects the code, plus the use of CODE basically says the
> above - it's a "this is non-standard and we know exactly what we're
> doing, don't mess with it" annotation.  If not then it'd be good to
> cover that in the comment since otherwise this seems like it's asking
> for a cleanup, we shouldn't really have raw SYM_START() in code.
> 
> I guess we also ought to annotate the 32 bit sigreturns as CODE too,
> though it's academic there without BTI.

Relating to this, we explicitly don't support calls to
__kernel_rt_sigreturn.

Rather, the "ret lr" that jumps here is supposed to be authenticated via
pointer auth in the caller.


If BTI {nothing} allows this while disallowing all BR/BLR then we could
use that (I can't remember what BTI {nothing} is useful for, if anything).

Otherwise, it's less clear what we should have here.


Cheers
---Dave
Will Deacon May 19, 2020, 1:29 p.m. UTC | #4
On Tue, May 19, 2020 at 02:21:01PM +0100, Dave Martin wrote:
> On Tue, May 19, 2020 at 01:18:16PM +0100, Will Deacon wrote:
> > For better or worse, GDB relies on the exact instruction sequence in the
> > VDSO sigreturn trampoline in order to unwind from signals correctly.
> 
> Are you sure?  I'm struggling to find the relevant code in gdb.

It looks pretty damning:

https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/aarch64-linux-tdep.c;h=34ba0d87baaff12f1f9711e777ab15a0a394f59b;hb=HEAD#l361

(and also look at struct tramp_frame).

Will
Mark Brown May 19, 2020, 2:35 p.m. UTC | #5
On Tue, May 19, 2020 at 02:25:38PM +0100, Dave Martin wrote:

> Rather, the "ret lr" that jumps here is supposed to be authenticated via
> pointer auth in the caller.

In which case there was an issue anyway...

> If BTI {nothing} allows this while disallowing all BR/BLR then we could
> use that (I can't remember what BTI {nothing} is useful for, if anything).

> Otherwise, it's less clear what we should have here.

I can't remember anything that distinguishes it from an explicit NOP.
Dave Martin May 19, 2020, 2:55 p.m. UTC | #6
On Tue, May 19, 2020 at 03:35:00PM +0100, Mark Brown wrote:
> On Tue, May 19, 2020 at 02:25:38PM +0100, Dave Martin wrote:
> 
> > Rather, the "ret lr" that jumps here is supposed to be authenticated via
> > pointer auth in the caller.
> 
> In which case there was an issue anyway...

What issue?

> > If BTI {nothing} allows this while disallowing all BR/BLR then we could
> > use that (I can't remember what BTI {nothing} is useful for, if anything).
> 
> > Otherwise, it's less clear what we should have here.
> 
> I can't remember anything that distinguishes it from an explicit NOP.

I think it rejects everything other then fallthrough execution
(BTYPE==0, which includes RET).  I might have misunderstood something
somewhere, but sort of feels like the right thing here.  I never put a
lot of effort into trying to understand BTI {nothing} though.  It
seemed a weird, possibly useless special case.


Of course, if gdb's unwinder does rely on recognising magic instruction
sequences in the sigreturn trampoline even when the vdso has valid
unwind information, we're probably doomed to stick for the current code
forever.

Cheers
---Dave
Will Deacon May 19, 2020, 3:26 p.m. UTC | #7
On Tue, May 19, 2020 at 01:38:43PM +0100, Mark Brown wrote:
> On Tue, May 19, 2020 at 01:18:16PM +0100, Will Deacon wrote:
> 
> > Fixes: 714a8d02ca4d ("arm64: asm: Override SYM_FUNC_START when building the kernel with BTI")
> 
> I'd say it's the annotation conversion not this, and also that the
> bikeshed would be most fetching in orange.
> 
> c91db232da484851 (arm64: vdso: Convert to modern assembler annotations)

I initially had both, but that felt weird so I dropped this one. However,
I'm happy to switch it for v2.

> > -SYM_FUNC_START(__kernel_rt_sigreturn)
> > +/*
> > + * GDB relies on being able to identify the sigreturn instruction sequence to
> > + * unwind from signal handlers. We cannot, therefore, use SYM_FUNC_START()
> > + * here, as it will emit a BTI C instruction and break the unwinder. Thankfully,
> > + * this function is only ever called from a RET and so omitting the landing pad
> > + * is perfectly fine.
> > + */
> > +SYM_START(__kernel_rt_sigreturn, SYM_L_GLOBAL, SYM_A_ALIGN)
> 
> Shouldn't this be a SYM_CODE_START()?  It's the same thing as above
> currently and we'll break an awful lot more if we change what it does in
> a way that affects the code, plus the use of CODE basically says the
> above - it's a "this is non-standard and we know exactly what we're
> doing, don't mess with it" annotation.  If not then it'd be good to
> cover that in the comment since otherwise this seems like it's asking
> for a cleanup, we shouldn't really have raw SYM_START() in code.
> 
> I guess we also ought to annotate the 32 bit sigreturns as CODE too,
> though it's academic there without BTI.

Yes, that's much better, I'll use SYM_CODE_START() and update the compat
code too (I'd forgotten it wasn't an array of hex anymore).

Thanks,

Will
Mark Brown May 19, 2020, 3:42 p.m. UTC | #8
On Tue, May 19, 2020 at 03:55:15PM +0100, Dave Martin wrote:
> On Tue, May 19, 2020 at 03:35:00PM +0100, Mark Brown wrote:
> > On Tue, May 19, 2020 at 02:25:38PM +0100, Dave Martin wrote:

> > > Rather, the "ret lr" that jumps here is supposed to be authenticated via
> > > pointer auth in the caller.

> > In which case there was an issue anyway...

> What issue?

None, I was confused.

> > > If BTI {nothing} allows this while disallowing all BR/BLR then we could
> > > use that (I can't remember what BTI {nothing} is useful for, if anything).

> > > Otherwise, it's less clear what we should have here.

> > I can't remember anything that distinguishes it from an explicit NOP.

> I think it rejects everything other then fallthrough execution
> (BTYPE==0, which includes RET).  I might have misunderstood something

Right, but since BTI only generates an exception when BTYPE != 0 I'm
having trouble differentiating this from a NOP in practical terms.

> somewhere, but sort of feels like the right thing here.  I never put a
> lot of effort into trying to understand BTI {nothing} though.  It
> seemed a weird, possibly useless special case.

That was my read too.
Dave Martin May 20, 2020, 9:48 a.m. UTC | #9
On Tue, May 19, 2020 at 04:42:47PM +0100, Mark Brown wrote:
> On Tue, May 19, 2020 at 03:55:15PM +0100, Dave Martin wrote:
> > On Tue, May 19, 2020 at 03:35:00PM +0100, Mark Brown wrote:
> > > On Tue, May 19, 2020 at 02:25:38PM +0100, Dave Martin wrote:
> 
> > > > Rather, the "ret lr" that jumps here is supposed to be authenticated via
> > > > pointer auth in the caller.
> 
> > > In which case there was an issue anyway...
> 
> > What issue?
> 
> None, I was confused.
> 
> > > > If BTI {nothing} allows this while disallowing all BR/BLR then we could
> > > > use that (I can't remember what BTI {nothing} is useful for, if anything).
> 
> > > > Otherwise, it's less clear what we should have here.
> 
> > > I can't remember anything that distinguishes it from an explicit NOP.
> 
> > I think it rejects everything other then fallthrough execution
> > (BTYPE==0, which includes RET).  I might have misunderstood something
> 
> Right, but since BTI only generates an exception when BTYPE != 0 I'm
> having trouble differentiating this from a NOP in practical terms.

The idea would be that if an attacker could fudge some function pointer
to point at __kernel_rt_sigreturn, attempting to do a call via that
pointer would still trigger a BTI trap.

This vulnerability isn't applicable to return addresses, because the
victim is supposed to sign those before storing them to (attackable)
memory, and authenticate between loading back from memory and doing the
RET.  So the victim can defend itself from that scenario.

> 
> > somewhere, but sort of feels like the right thing here.  I never put a
> > lot of effort into trying to understand BTI {nothing} though.  It
> > seemed a weird, possibly useless special case.
> 
> That was my read too.

And if the gdb doesn't tolerate modification of the exact insn sequence,
we can't do it anyway.  I'd really say that's a bug-like rogue heuristic
in gdb and "not our problem".  But people will moan about regressions
nonetheless.

I was that interested because of the potential use for BTI {nothing}.
I'd have to actually try it out to be 100% sure it works anyway.

Cheers
---Dave
Mark Brown May 20, 2020, 10:46 a.m. UTC | #10
On Wed, May 20, 2020 at 10:48:45AM +0100, Dave Martin wrote:
> On Tue, May 19, 2020 at 04:42:47PM +0100, Mark Brown wrote:
> > On Tue, May 19, 2020 at 03:55:15PM +0100, Dave Martin wrote:

> > > > > If BTI {nothing} allows this while disallowing all BR/BLR then we could
> > > > > use that (I can't remember what BTI {nothing} is useful for, if anything).

> > > > > Otherwise, it's less clear what we should have here.

> > > > I can't remember anything that distinguishes it from an explicit NOP.

> > > I think it rejects everything other then fallthrough execution
> > > (BTYPE==0, which includes RET).  I might have misunderstood something

> > Right, but since BTI only generates an exception when BTYPE != 0 I'm
> > having trouble differentiating this from a NOP in practical terms.

> The idea would be that if an attacker could fudge some function pointer
> to point at __kernel_rt_sigreturn, attempting to do a call via that
> pointer would still trigger a BTI trap.

We'll get a BTI exception no matter what instruction is here so long as
it's not an appropriate BTI landing pad so unless we want to prevent one
being generated there's no need to change the instruction sequence.  Or
perhaps I'm not quite getting the scenario you're thinking of?
Dave Martin May 20, 2020, 11:08 a.m. UTC | #11
On Wed, May 20, 2020 at 11:46:53AM +0100, Mark Brown wrote:
> On Wed, May 20, 2020 at 10:48:45AM +0100, Dave Martin wrote:
> > On Tue, May 19, 2020 at 04:42:47PM +0100, Mark Brown wrote:
> > > On Tue, May 19, 2020 at 03:55:15PM +0100, Dave Martin wrote:
> 
> > > > > > If BTI {nothing} allows this while disallowing all BR/BLR then we could
> > > > > > use that (I can't remember what BTI {nothing} is useful for, if anything).
> 
> > > > > > Otherwise, it's less clear what we should have here.
> 
> > > > > I can't remember anything that distinguishes it from an explicit NOP.
> 
> > > > I think it rejects everything other then fallthrough execution
> > > > (BTYPE==0, which includes RET).  I might have misunderstood something
> 
> > > Right, but since BTI only generates an exception when BTYPE != 0 I'm
> > > having trouble differentiating this from a NOP in practical terms.
> 
> > The idea would be that if an attacker could fudge some function pointer
> > to point at __kernel_rt_sigreturn, attempting to do a call via that
> > pointer would still trigger a BTI trap.
> 
> We'll get a BTI exception no matter what instruction is here so long as
> it's not an appropriate BTI landing pad so unless we want to prevent one
> being generated there's no need to change the instruction sequence.  Or
> perhaps I'm not quite getting the scenario you're thinking of?

Duh, yes.  I guess we're good, then.

Cheers
---Dave
diff mbox series

Patch

diff --git a/arch/arm64/kernel/vdso/sigreturn.S b/arch/arm64/kernel/vdso/sigreturn.S
index 3fb13b81f780..83ac284dae79 100644
--- a/arch/arm64/kernel/vdso/sigreturn.S
+++ b/arch/arm64/kernel/vdso/sigreturn.S
@@ -15,7 +15,14 @@ 
 	.text
 
 	nop
-SYM_FUNC_START(__kernel_rt_sigreturn)
+/*
+ * GDB relies on being able to identify the sigreturn instruction sequence to
+ * unwind from signal handlers. We cannot, therefore, use SYM_FUNC_START()
+ * here, as it will emit a BTI C instruction and break the unwinder. Thankfully,
+ * this function is only ever called from a RET and so omitting the landing pad
+ * is perfectly fine.
+ */
+SYM_START(__kernel_rt_sigreturn, SYM_L_GLOBAL, SYM_A_ALIGN)
 	.cfi_startproc
 	.cfi_signal_frame
 	.cfi_def_cfa	x29, 0