riscv: fix a nasty sigreturn bug...

Message ID YU0wDzeS/jXwkAca@zeniv-ca.linux.org.uk (mailing list archive)
State New, archived

Commit Message

Al Viro Sept. 24, 2021, 1:55 a.m. UTC
riscv has an equivalent of the arm bug fixed by 653d48b22166: if a signal
gets caught by an interrupt that hits when we have the right value
in a0 (-513), *and* another signal gets delivered upon sigreturn()
(e.g. included in the blocked mask for the first signal and posted
while the handler was running), the syscall restart logic will
see regs->cause equal to EXC_SYSCALL (we are in a syscall, after all)
and a0 already restored to its original value (-513, which happens to
be -ERESTARTNOINTR) and assume that we need to apply the usual
syscall restart logic.
    
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
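
A userspace model of the scenario above may make the sequence easier to
follow. It is only a sketch under stated assumptions: struct model_regs and
the model_* functions are hypothetical stand-ins for struct pt_regs,
rt_sigreturn and the restart check done at signal delivery, only the
-ERESTARTNOINTR case is modelled, and the constant values are assumed to
match the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to match the kernel's values; this is a userspace model only. */
#define EXC_SYSCALL    8UL
#define ERESTARTNOINTR 513

/* Hypothetical, heavily reduced stand-in for struct pt_regs. */
struct model_regs {
	unsigned long epc;
	unsigned long a0;
	unsigned long orig_a0;
	unsigned long cause;
};

/* Model of rt_sigreturn: restore the user registers saved in the frame. */
static void model_rt_sigreturn(struct model_regs *regs,
			       const struct model_regs *frame, bool with_fix)
{
	regs->epc = frame->epc;
	regs->a0 = frame->a0;
	/* Without the fix, cause still says "we came from a syscall". */
	if (with_fix)
		regs->cause = -1UL;
}

/* Model of the restart check applied when the second signal is delivered. */
static bool model_second_signal(struct model_regs *regs)
{
	if (regs->cause != EXC_SYSCALL)
		return false;
	regs->cause = -1UL;
	if (regs->a0 != (unsigned long)-ERESTARTNOINTR)
		return false;
	regs->a0 = regs->orig_a0;	/* clobbers the restored a0 */
	regs->epc -= 4;			/* rewinds the restored epc */
	return true;
}

/* Returns the epc userspace would resume at after the whole sequence. */
static unsigned long model_resume_epc(bool with_fix)
{
	/* Context interrupted outside any syscall, with a0 == -513. */
	const struct model_regs frame = {
		.epc = 0x1000,
		.a0 = (unsigned long)-ERESTARTNOINTR,
	};
	/* Registers on entry to rt_sigreturn, which is itself a syscall. */
	struct model_regs regs = {
		.epc = 0x2000, .a0 = 0, .orig_a0 = 0, .cause = EXC_SYSCALL,
	};

	assert(regs.cause == EXC_SYSCALL);
	model_rt_sigreturn(&regs, &frame, with_fix);
	model_second_signal(&regs);
	return regs.epc;
}
```

In this model, model_resume_epc(false) comes back four bytes before the
interrupted instruction, while model_resume_epc(true) returns the
interrupted epc unchanged.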

Comments

Al Viro Sept. 2, 2022, 12:13 a.m. UTC | #1
Ping?  Does anybody have objections?  AFAICS, the bug is still
there...

On Fri, Sep 24, 2021 at 01:55:27AM +0000, Al Viro wrote:
> riscv has an equivalent of arm bug fixed by 653d48b22166; if signal
> gets caught by an interrupt that hits when we have the right value
> in a0 (-513), *and* another signal gets delivered upon sigreturn()
> (e.g. included into the blocked mask for the first signal and posted
> while the handler had been running), the syscall restart logics will
> see regs->cause equal to EXC_SYSCALL (we are in a syscall, after all)
> and a0 already restored to its original value (-513, which happens to
> be -ERESTARTNOINTR) and assume that we need to apply the usual
> syscall restart logics.
>     
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
> diff --git a/arch/riscv/kernel/signal.c b/arch/riscv/kernel/signal.c
> index c2d5ecbe55264..f8fb85dc94b7a 100644
> --- a/arch/riscv/kernel/signal.c
> +++ b/arch/riscv/kernel/signal.c
> @@ -121,6 +121,8 @@ SYSCALL_DEFINE0(rt_sigreturn)
>  	if (restore_altstack(&frame->uc.uc_stack))
>  		goto badframe;
>  
> +	regs->cause = -1UL;
> +
>  	return regs->a0;
>  
>  badframe:
Andrew Jones Sept. 2, 2022, 9:22 a.m. UTC | #2
On Fri, Sep 24, 2021 at 01:55:27AM +0000, Al Viro wrote:
> riscv has an equivalent of arm bug fixed by 653d48b22166; if signal
> gets caught by an interrupt that hits when we have the right value
> in a0 (-513), *and* another signal gets delivered upon sigreturn()
> (e.g. included into the blocked mask for the first signal and posted
> while the handler had been running), the syscall restart logics will
> see regs->cause equal to EXC_SYSCALL (we are in a syscall, after all)
> and a0 already restored to its original value (-513, which happens to
> be -ERESTARTNOINTR) and assume that we need to apply the usual
> syscall restart logics.
>     
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
> diff --git a/arch/riscv/kernel/signal.c b/arch/riscv/kernel/signal.c
> index c2d5ecbe55264..f8fb85dc94b7a 100644
> --- a/arch/riscv/kernel/signal.c
> +++ b/arch/riscv/kernel/signal.c
> @@ -121,6 +121,8 @@ SYSCALL_DEFINE0(rt_sigreturn)
>  	if (restore_altstack(&frame->uc.uc_stack))
>  		goto badframe;
>  
> +	regs->cause = -1UL;
> +
>  	return regs->a0;
>  
>  badframe:
>

This looks good to me based on what other architectures do.

For example, arm64 does

  rt_sigreturn
    restore_sigframe
      forget_syscall
        regs->syscallno = NO_SYSCALL

  which results in do_signal avoiding syscall restarting

And x86 does

  rt_sigreturn
    restore_sigcontext
      regs->orig_ax = -1

  where its handle_signal only restarts syscalls when regs->orig_ax != -1

So, for riscv, where do_signal and handle_signal avoid syscall restarting
when regs->cause != EXC_SYSCALL, and where it's common to set cause to -1
to avoid it, it makes sense to set regs->cause to something other than
EXC_SYSCALL in rt_sigreturn, or in restore_sigcontext, which rt_sigreturn
calls, as well.
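
The pattern described above (a "no syscall" sentinel checked before any
restart handling) can be sketched in plain C. This is a minimal userspace
illustration, not the kernel code: struct sketch_regs, NO_SYSCALL_CAUSE and
the sketch_* functions are hypothetical names, the constant values are
assumed to match the kernel headers, and only -ERESTARTNOINTR is handled.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to match the kernel's values; userspace illustration only. */
#define EXC_SYSCALL      8UL
#define ERESTARTNOINTR   513
#define NO_SYSCALL_CAUSE (-1UL)	/* hypothetical name for the -1 sentinel */

/* Hypothetical, heavily reduced stand-in for struct pt_regs. */
struct sketch_regs {
	unsigned long epc;
	unsigned long a0;
	unsigned long orig_a0;
	unsigned long cause;
};

/* Restart only if we came from a syscall and a0 holds a restart code. */
static bool sketch_should_restart(struct sketch_regs *regs)
{
	if (regs->cause != EXC_SYSCALL)
		return false;
	/* Mark the syscall as handled so it is never restarted twice. */
	regs->cause = NO_SYSCALL_CAUSE;
	if (regs->a0 != (unsigned long)-ERESTARTNOINTR)
		return false;
	regs->a0 = regs->orig_a0;	/* put the first argument back */
	regs->epc -= 4;			/* re-execute the ecall */
	return true;
}

/* Usage example: same registers, with and without the sentinel. */
static void sketch_demo(void)
{
	struct sketch_regs in_syscall = {
		.epc = 0x1004,
		.a0 = (unsigned long)-ERESTARTNOINTR,
		.orig_a0 = 7,
		.cause = EXC_SYSCALL,
	};
	struct sketch_regs forgotten = in_syscall;

	forgotten.cause = NO_SYSCALL_CAUSE;	/* what the patch does */

	assert(sketch_should_restart(&in_syscall));
	assert(in_syscall.epc == 0x1000 && in_syscall.a0 == 7);
	assert(!sketch_should_restart(&in_syscall));	/* cause was reset */
	assert(!sketch_should_restart(&forgotten));
	assert(forgotten.epc == 0x1004);
}
```

With the sentinel set, the check bails out before it ever looks at a0, so
a restored a0 of -513 is left alone.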

So the only question I have is whether or not the cause assignment
is better in restore_sigcontext() like other architectures? At least,
since rt_sigreturn is the only caller of restore_sigcontext, it can't
break anything putting it there atm...

Anyway,

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>

BTW, I ran the testcase from 653d48b22166 with the asm modified for
riscv for a while over QEMU. It didn't reproduce, but I suppose that
doesn't mean much.

Thanks,
drew
Al Viro Sept. 2, 2022, 5:59 p.m. UTC | #3
On Fri, Sep 02, 2022 at 11:22:45AM +0200, Andrew Jones wrote:

> So, for riscv, where do_signal and handle_signal avoid syscall restarting
> when regs->cause != EXC_SYSCALL, and where it's common to set cause to -1
> to avoid it, it makes sense to set regs->cause to something other than
> EXC_SYSCALL in rt_sigreturn, or in restore_sigcontext, which rt_sigreturn
> calls, as well.
> 
> So the only question I have is whether or not the cause assignment
> is better in restore_sigcontext() like other architectures? At least,
> since rt_sigreturn is the only caller of restore_sigcontext, it can't
> break anything putting it there atm...

	The only reason for doing that in restore_sigcontext() is that for
old architectures there'd been separate sigreturn(2) and rt_sigreturn(2).
Doing that in the helper shared by both avoided duplication; since
there's no such thing on riscv...

	Matter of taste, really - I have a slight preference for doing that
closer to the syscall surface, but it's no more than that.
Palmer Dabbelt Sept. 15, 2022, 6:48 p.m. UTC | #4
> Ping?  Does anybody have objections?  AFAICS, the bug is still
> there...

Sorry, something's gone off the rails with email and this thread doesn't 
show up in my inbox (not even any of the replies).  I tried to patch 
together this reply manually so hopefully it works.

This is on fixes, thanks -- trying to debug this one would have been a 
nightmare.

> On Fri, Sep 24, 2021 at 01:55:27AM +0000, Al Viro wrote:
>> riscv has an equivalent of arm bug fixed by 653d48b22166; if signal
>> gets caught by an interrupt that hits when we have the right value
>> in a0 (-513), *and* another signal gets delivered upon sigreturn()
>> (e.g. included into the blocked mask for the first signal and posted
>> while the handler had been running), the syscall restart logics will
>> see regs->cause equal to EXC_SYSCALL (we are in a syscall, after all)
>> and a0 already restored to its original value (-513, which happens to
>> be -ERESTARTNOINTR) and assume that we need to apply the usual
>> syscall restart logics.
>>     
>> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
>> ---
>> diff --git a/arch/riscv/kernel/signal.c b/arch/riscv/kernel/signal.c
>> index c2d5ecbe55264..f8fb85dc94b7a 100644
>> --- a/arch/riscv/kernel/signal.c
>> +++ b/arch/riscv/kernel/signal.c
>> @@ -121,6 +121,8 @@ SYSCALL_DEFINE0(rt_sigreturn)
>>  	if (restore_altstack(&frame->uc.uc_stack))
>>  		goto badframe;
>>  
>> +	regs->cause = -1UL;
>> +
>>  	return regs->a0;
>>  
>>  badframe:

Patch

diff --git a/arch/riscv/kernel/signal.c b/arch/riscv/kernel/signal.c
index c2d5ecbe55264..f8fb85dc94b7a 100644
--- a/arch/riscv/kernel/signal.c
+++ b/arch/riscv/kernel/signal.c
@@ -121,6 +121,8 @@  SYSCALL_DEFINE0(rt_sigreturn)
 	if (restore_altstack(&frame->uc.uc_stack))
 		goto badframe;
 
+	regs->cause = -1UL;
+
 	return regs->a0;
 
 badframe: