RFC: riscv: save the SR_SUM status over switches

Message ID	20210318151010.100966-1-ben.dooks@codethink.co.uk (mailing list archive)
State	New, archived
Headers	show Return-Path: <SRS0=IhzG=IQ=lists.infradead.org=linux-riscv-bounces+linux-riscv=archiver.kernel.org@kernel.org> DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C5CE364F1D From: Ben Dooks <ben.dooks@codethink.co.uk> To: Paul Walmsley <paul.walmsley@sifive.com>, Palmer Dabbelt <palmer@dabbelt.com>, linux-riscv@lists.infradead.org Cc: Terry Hu <kejia.hu@codethink.co.uk>, syzbot+e74b94fe601ab9552d69@syzkaller.appspotmail.com, Ben Dooks <ben.dooks@codethink.co.uk>, Dmitry Vyukov <dvyukov@google.com>, Javier Jardon <javier.jardon@codethink.co.uk> Subject: [PATCH] RFC: riscv: save the SR_SUM status over switches Date: Thu, 18 Mar 2021 15:10:10 +0000 Message-Id: <20210318151010.100966-1-ben.dooks@codethink.co.uk> MIME-Version: 1.0 Precedence: list Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-riscv" <linux-riscv-bounces@lists.infradead.org> Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org
Series	RFC: riscv: save the SR_SUM status over switches \| expand RFC: riscv: save the SR_SUM status over switches

Message ID

20210318151010.100966-1-ben.dooks@codethink.co.uk (mailing list archive)

State

New, archived

Headers

DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C5CE364F1D
From: Ben Dooks <ben.dooks@codethink.co.uk>
To: Paul Walmsley <paul.walmsley@sifive.com>,
 Palmer Dabbelt <palmer@dabbelt.com>, linux-riscv@lists.infradead.org
Cc: Terry Hu <kejia.hu@codethink.co.uk>,
 syzbot+e74b94fe601ab9552d69@syzkaller.appspotmail.com,
 Ben Dooks <ben.dooks@codethink.co.uk>, Dmitry Vyukov <dvyukov@google.com>,
 Javier Jardon <javier.jardon@codethink.co.uk>
Subject: [PATCH] RFC: riscv: save the SR_SUM status over switches
Date: Thu, 18 Mar 2021 15:10:10 +0000
Message-Id: <20210318151010.100966-1-ben.dooks@codethink.co.uk>
MIME-Version: 1.0
Precedence: list
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: "linux-riscv" <linux-riscv-bounces@lists.infradead.org>
Errors-To: 
 linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org

Series

RFC: riscv: save the SR_SUM status over switches | expand

Commit Message

Ben Dooks March 18, 2021, 3:10 p.m. UTC

When threads/tasks are switched we need to ensure the old execution's
SR_SUM state is saved and the new thread has the old SR_SUM state
restored.

The issue is seen under heavy load especially with the syz-stress tool
running, with crashes as follows in schedule_tail:

Unable to handle kernel access to user memory without uaccess routines at virtual address 000000002749f0d0
Oops [#1]
Modules linked in:
CPU: 1 PID: 4875 Comm: syz-executor.0 Not tainted 5.12.0-rc2-syzkaller-00467-g0d7588ab9ef9 #0
Hardware name: riscv-virtio,qemu (DT)
epc : schedule_tail+0x72/0xb2 kernel/sched/core.c:4264
 ra : task_pid_vnr include/linux/sched.h:1421 [inline]
 ra : schedule_tail+0x70/0xb2 kernel/sched/core.c:4264
epc : ffffffe00008c8b0 ra : ffffffe00008c8ae sp : ffffffe025d17ec0
 gp : ffffffe005d25378 tp : ffffffe00f0d0000 t0 : 0000000000000000
 t1 : 0000000000000001 t2 : 00000000000f4240 s0 : ffffffe025d17ee0
 s1 : 000000002749f0d0 a0 : 000000000000002a a1 : 0000000000000003
 a2 : 1ffffffc0cfac500 a3 : ffffffe0000c80cc a4 : 5ae9db91c19bbe00
 a5 : 0000000000000000 a6 : 0000000000f00000 a7 : ffffffe000082eba
 s2 : 0000000000040000 s3 : ffffffe00eef96c0 s4 : ffffffe022c77fe0
 s5 : 0000000000004000 s6 : ffffffe067d74e00 s7 : ffffffe067d74850
 s8 : ffffffe067d73e18 s9 : ffffffe067d74e00 s10: ffffffe00eef96e8
 s11: 000000ae6cdf8368 t3 : 5ae9db91c19bbe00 t4 : ffffffc4043cafb2
 t5 : ffffffc4043cafba t6 : 0000000000040000
status: 0000000000000120 badaddr: 000000002749f0d0 cause: 000000000000000f
Call Trace:
[<ffffffe00008c8b0>] schedule_tail+0x72/0xb2 kernel/sched/core.c:4264
[<ffffffe000005570>] ret_from_exception+0x0/0x14
Dumping ftrace buffer:
   (ftrace buffer empty)
---[ end trace b5f8f9231dc87dda ]---

The issue comes from the put_user() in schedule_tail (kernel/sched/core.c)
doing the following:

asmlinkage __visible void schedule_tail(struct task_struct *prev)
{
...
        if (current->set_child_tid)
                put_user(task_pid_vnr(current), current->set_child_tid);
...
}

the put_user() macro causes the code sequence to come out as follows:

1:	__enable_user_access()
2:	reg = task_pid_vnr(current);
3:	*current->set_child_tid = reg;
4:	__disable_user_access()

This means the task_pid_vnr() is being called with user-access enabled
which itself is not a good idea, but that is a seperate issue. Here we
have a function that /might/ sleep being called with the SR_SUM and if
it does, then it returns with the SR_SUM flag possibly cleared thus
causing the above abort.

To try and deal with this, and stop the SR_SUM leaking out into other
threads (this has also been tested and see under stress. It can rarely
happen but it /does/ under load) make sure the __switch_to() will save
and restore the SR_SUM flag, and clear it possibly for the next thread
if it does not need it.

Note, test code to be supplied once other checks have been finished.

There may be further issues with the mstatus flags with this, this
can be discussed further once some initial testing has been done.

Reported-by: syzbot+e74b94fe601ab9552d69@syzkaller.appspotmail.com
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Javier Jardon <javier.jardon@codethink.co.uk>
---
 arch/riscv/include/asm/processor.h | 1 +
 arch/riscv/kernel/asm-offsets.c    | 5 +++++
 arch/riscv/kernel/entry.S          | 7 +++++++
 3 files changed, 13 insertions(+)

Comments

Alexandre Ghiti March 18, 2021, 4:57 p.m. UTC | #1

Hi Ben,

Le 3/18/21 à 11:10 AM, Ben Dooks a écrit :
> When threads/tasks are switched we need to ensure the old execution's
> SR_SUM state is saved and the new thread has the old SR_SUM state
> restored.
> 
> The issue is seen under heavy load especially with the syz-stress tool
> running, with crashes as follows in schedule_tail:
> 
> Unable to handle kernel access to user memory without uaccess routines at virtual address 000000002749f0d0
> Oops [#1]
> Modules linked in:
> CPU: 1 PID: 4875 Comm: syz-executor.0 Not tainted 5.12.0-rc2-syzkaller-00467-g0d7588ab9ef9 #0
> Hardware name: riscv-virtio,qemu (DT)
> epc : schedule_tail+0x72/0xb2 kernel/sched/core.c:4264
>   ra : task_pid_vnr include/linux/sched.h:1421 [inline]
>   ra : schedule_tail+0x70/0xb2 kernel/sched/core.c:4264
> epc : ffffffe00008c8b0 ra : ffffffe00008c8ae sp : ffffffe025d17ec0
>   gp : ffffffe005d25378 tp : ffffffe00f0d0000 t0 : 0000000000000000
>   t1 : 0000000000000001 t2 : 00000000000f4240 s0 : ffffffe025d17ee0
>   s1 : 000000002749f0d0 a0 : 000000000000002a a1 : 0000000000000003
>   a2 : 1ffffffc0cfac500 a3 : ffffffe0000c80cc a4 : 5ae9db91c19bbe00
>   a5 : 0000000000000000 a6 : 0000000000f00000 a7 : ffffffe000082eba
>   s2 : 0000000000040000 s3 : ffffffe00eef96c0 s4 : ffffffe022c77fe0
>   s5 : 0000000000004000 s6 : ffffffe067d74e00 s7 : ffffffe067d74850
>   s8 : ffffffe067d73e18 s9 : ffffffe067d74e00 s10: ffffffe00eef96e8
>   s11: 000000ae6cdf8368 t3 : 5ae9db91c19bbe00 t4 : ffffffc4043cafb2
>   t5 : ffffffc4043cafba t6 : 0000000000040000
> status: 0000000000000120 badaddr: 000000002749f0d0 cause: 000000000000000f
> Call Trace:
> [<ffffffe00008c8b0>] schedule_tail+0x72/0xb2 kernel/sched/core.c:4264
> [<ffffffe000005570>] ret_from_exception+0x0/0x14
> Dumping ftrace buffer:
>     (ftrace buffer empty)
> ---[ end trace b5f8f9231dc87dda ]---
> 
> The issue comes from the put_user() in schedule_tail (kernel/sched/core.c)
> doing the following:
> 
> asmlinkage __visible void schedule_tail(struct task_struct *prev)
> {
> ...
>          if (current->set_child_tid)
>                  put_user(task_pid_vnr(current), current->set_child_tid);
> ...
> }
> 
> the put_user() macro causes the code sequence to come out as follows:
> 
> 1:	__enable_user_access()
> 2:	reg = task_pid_vnr(current);
> 3:	*current->set_child_tid = reg;
> 4:	__disable_user_access()
> 
> This means the task_pid_vnr() is being called with user-access enabled
> which itself is not a good idea, but that is a seperate issue. Here we
> have a function that /might/ sleep being called with the SR_SUM and if
> it does, then it returns with the SR_SUM flag possibly cleared thus
> causing the above abort.
> 
> To try and deal with this, and stop the SR_SUM leaking out into other
> threads (this has also been tested and see under stress. It can rarely
> happen but it /does/ under load) make sure the __switch_to() will save
> and restore the SR_SUM flag, and clear it possibly for the next thread
> if it does not need it.
> 
> Note, test code to be supplied once other checks have been finished.

Very nice, good catch !

> 
> There may be further issues with the mstatus flags with this, this
> can be discussed further once some initial testing has been done.

Yes, that must be discussed, I don't have opinion right now but if this 
is the only bit to save/restore, we may consider other options: disable 
preemption, using __typeof(x) like suggested by Dmitry to avoid being 
scheduled and maybe other options.

> 
> Reported-by: syzbot+e74b94fe601ab9552d69@syzkaller.appspotmail.com
> Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Cc: Javier Jardon <javier.jardon@codethink.co.uk>
> ---
>   arch/riscv/include/asm/processor.h | 1 +
>   arch/riscv/kernel/asm-offsets.c    | 5 +++++
>   arch/riscv/kernel/entry.S          | 7 +++++++
>   3 files changed, 13 insertions(+)
> 
> diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
> index 021ed64ee608..e2398c6e7af9 100644
> --- a/arch/riscv/include/asm/processor.h
> +++ b/arch/riscv/include/asm/processor.h
> @@ -35,6 +35,7 @@ struct thread_struct {
>   	unsigned long s[12];	/* s[0]: frame pointer */
>   	struct __riscv_d_ext_state fstate;
>   	unsigned long bad_cause;
> +	unsigned long flags;

I would not call it "flags", I'd rather use "status", named after 
CSR_STATUS.

>   };
>   
>   #define INIT_THREAD {					\
> diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c
> index 9ef33346853c..0036570752d9 100644
> --- a/arch/riscv/kernel/asm-offsets.c
> +++ b/arch/riscv/kernel/asm-offsets.c
> @@ -29,6 +29,7 @@ void asm_offsets(void)
>   	OFFSET(TASK_THREAD_S9, task_struct, thread.s[9]);
>   	OFFSET(TASK_THREAD_S10, task_struct, thread.s[10]);
>   	OFFSET(TASK_THREAD_S11, task_struct, thread.s[11]);
> +	OFFSET(TASK_THREAD_FLAGS, task_struct, thread.flags);
>   	OFFSET(TASK_TI_FLAGS, task_struct, thread_info.flags);
>   	OFFSET(TASK_TI_PREEMPT_COUNT, task_struct, thread_info.preempt_count);
>   	OFFSET(TASK_TI_KERNEL_SP, task_struct, thread_info.kernel_sp);
> @@ -172,6 +173,10 @@ void asm_offsets(void)
>   		  offsetof(struct task_struct, thread.s[11])
>   		- offsetof(struct task_struct, thread.ra)
>   	);
> +	DEFINE(TASK_THREAD_FLAGS_RA,
> +		  offsetof(struct task_struct, thread.flags)
> +		- offsetof(struct task_struct, thread.ra)
> +	);
>   
>   	DEFINE(TASK_THREAD_F0_F0,
>   		  offsetof(struct task_struct, thread.fstate.f[0])
> diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
> index 744f3209c48d..6137f6c2a2ad 100644
> --- a/arch/riscv/kernel/entry.S
> +++ b/arch/riscv/kernel/entry.S
> @@ -417,7 +417,14 @@ ENTRY(__switch_to)
>   	REG_S s9,  TASK_THREAD_S9_RA(a3)
>   	REG_S s10, TASK_THREAD_S10_RA(a3)
>   	REG_S s11, TASK_THREAD_S11_RA(a3)
> +	/* save (and disable the user space access flag) */
> +	li    s0, SR_SUM
> +	csrrc s1, CSR_STATUS, s0

I don't think that clearing SR_SUM is needed here since you restore its 
value later on.

> +	REG_S s1,  TASK_THREAD_FLAGS_RA(a3)
> +
>   	/* Restore context from next->thread */
> +	REG_L s0,  TASK_THREAD_FLAGS_RA(a4)
> +	csrs  CSR_STATUS, s0
>   	REG_L ra,  TASK_THREAD_RA_RA(a4)
>   	REG_L sp,  TASK_THREAD_SP_RA(a4)
>   	REG_L s0,  TASK_THREAD_S0_RA(a4)
> 

Again, very nice finding, well done :)

Alex

Ben Dooks March 18, 2021, 5:42 p.m. UTC | #2

On 18/03/2021 16:57, Alex Ghiti wrote:
> Hi Ben,
> 
> Le 3/18/21 à 11:10 AM, Ben Dooks a écrit :
>> When threads/tasks are switched we need to ensure the old execution's
>> SR_SUM state is saved and the new thread has the old SR_SUM state
>> restored.
>>
>> The issue is seen under heavy load especially with the syz-stress tool
>> running, with crashes as follows in schedule_tail:
>>
>> Unable to handle kernel access to user memory without uaccess routines 
>> at virtual address 000000002749f0d0
>> Oops [#1]
>> Modules linked in:
>> CPU: 1 PID: 4875 Comm: syz-executor.0 Not tainted 
>> 5.12.0-rc2-syzkaller-00467-g0d7588ab9ef9 #0
>> Hardware name: riscv-virtio,qemu (DT)
>> epc : schedule_tail+0x72/0xb2 kernel/sched/core.c:4264
>>   ra : task_pid_vnr include/linux/sched.h:1421 [inline]
>>   ra : schedule_tail+0x70/0xb2 kernel/sched/core.c:4264
>> epc : ffffffe00008c8b0 ra : ffffffe00008c8ae sp : ffffffe025d17ec0
>>   gp : ffffffe005d25378 tp : ffffffe00f0d0000 t0 : 0000000000000000
>>   t1 : 0000000000000001 t2 : 00000000000f4240 s0 : ffffffe025d17ee0
>>   s1 : 000000002749f0d0 a0 : 000000000000002a a1 : 0000000000000003
>>   a2 : 1ffffffc0cfac500 a3 : ffffffe0000c80cc a4 : 5ae9db91c19bbe00
>>   a5 : 0000000000000000 a6 : 0000000000f00000 a7 : ffffffe000082eba
>>   s2 : 0000000000040000 s3 : ffffffe00eef96c0 s4 : ffffffe022c77fe0
>>   s5 : 0000000000004000 s6 : ffffffe067d74e00 s7 : ffffffe067d74850
>>   s8 : ffffffe067d73e18 s9 : ffffffe067d74e00 s10: ffffffe00eef96e8
>>   s11: 000000ae6cdf8368 t3 : 5ae9db91c19bbe00 t4 : ffffffc4043cafb2
>>   t5 : ffffffc4043cafba t6 : 0000000000040000
>> status: 0000000000000120 badaddr: 000000002749f0d0 cause: 
>> 000000000000000f
>> Call Trace:
>> [<ffffffe00008c8b0>] schedule_tail+0x72/0xb2 kernel/sched/core.c:4264
>> [<ffffffe000005570>] ret_from_exception+0x0/0x14
>> Dumping ftrace buffer:
>>     (ftrace buffer empty)
>> ---[ end trace b5f8f9231dc87dda ]---
>>
>> The issue comes from the put_user() in schedule_tail 
>> (kernel/sched/core.c)
>> doing the following:
>>
>> asmlinkage __visible void schedule_tail(struct task_struct *prev)
>> {
>> ...
>>          if (current->set_child_tid)
>>                  put_user(task_pid_vnr(current), current->set_child_tid);
>> ...
>> }
>>
>> the put_user() macro causes the code sequence to come out as follows:
>>
>> 1:    __enable_user_access()
>> 2:    reg = task_pid_vnr(current);
>> 3:    *current->set_child_tid = reg;
>> 4:    __disable_user_access()
>>
>> This means the task_pid_vnr() is being called with user-access enabled
>> which itself is not a good idea, but that is a seperate issue. Here we
>> have a function that /might/ sleep being called with the SR_SUM and if
>> it does, then it returns with the SR_SUM flag possibly cleared thus
>> causing the above abort.
>>
>> To try and deal with this, and stop the SR_SUM leaking out into other
>> threads (this has also been tested and see under stress. It can rarely
>> happen but it /does/ under load) make sure the __switch_to() will save
>> and restore the SR_SUM flag, and clear it possibly for the next thread
>> if it does not need it.
>>
>> Note, test code to be supplied once other checks have been finished.
> 
> Very nice, good catch !

Yes, this has taken about 3 days of work to find (thank you to Codethink
for not making me do more boring work).

I'll try and make a blog post, a talk and some of the tools/patches used
available for people to look at.

So far this patch has been running syz-stress under qemu-riscv64 for
about 2 hours without an issue. My heating bill might be a bit reduced
for today.

>>
>> There may be further issues with the mstatus flags with this, this
>> can be discussed further once some initial testing has been done.
> 
> Yes, that must be discussed, I don't have opinion right now but if this 
> is the only bit to save/restore, we may consider other options: disable 
> preemption, using __typeof(x) like suggested by Dmitry to avoid being 
> scheduled and maybe other options.

I had a go at the typeof, it made a lot of messy kernel warnings and I
am not sure if it will actually fully fix things as gcc may well move
things around anyway.

There really does need to be a fix for the amount of code that might
be executed under put_user(), but I think the correct thing to do is
to ensure that the flags are saved over the switch_to().

I'm not sure if disabling pre-emption is great either as there are now
kernel sections where the access is disabled for using the "unsafe"
versions of the user space access.

>>
>> Reported-by: syzbot+e74b94fe601ab9552d69@syzkaller.appspotmail.com
>> Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
>> Cc: Dmitry Vyukov <dvyukov@google.com>
>> Cc: Javier Jardon <javier.jardon@codethink.co.uk>
>> ---
>>   arch/riscv/include/asm/processor.h | 1 +
>>   arch/riscv/kernel/asm-offsets.c    | 5 +++++
>>   arch/riscv/kernel/entry.S          | 7 +++++++
>>   3 files changed, 13 insertions(+)
>>
>> diff --git a/arch/riscv/include/asm/processor.h 
>> b/arch/riscv/include/asm/processor.h
>> index 021ed64ee608..e2398c6e7af9 100644
>> --- a/arch/riscv/include/asm/processor.h
>> +++ b/arch/riscv/include/asm/processor.h
>> @@ -35,6 +35,7 @@ struct thread_struct {
>>       unsigned long s[12];    /* s[0]: frame pointer */
>>       struct __riscv_d_ext_state fstate;
>>       unsigned long bad_cause;
>> +    unsigned long flags;
> 
> I would not call it "flags", I'd rather use "status", named after 
> CSR_STATUS.

Ok, good idea. Not sure if it should be status or csr_status.

> 
>>   };
>>   #define INIT_THREAD {                    \
>> diff --git a/arch/riscv/kernel/asm-offsets.c 
>> b/arch/riscv/kernel/asm-offsets.c
>> index 9ef33346853c..0036570752d9 100644
>> --- a/arch/riscv/kernel/asm-offsets.c
>> +++ b/arch/riscv/kernel/asm-offsets.c
>> @@ -29,6 +29,7 @@ void asm_offsets(void)
>>       OFFSET(TASK_THREAD_S9, task_struct, thread.s[9]);
>>       OFFSET(TASK_THREAD_S10, task_struct, thread.s[10]);
>>       OFFSET(TASK_THREAD_S11, task_struct, thread.s[11]);
>> +    OFFSET(TASK_THREAD_FLAGS, task_struct, thread.flags);
>>       OFFSET(TASK_TI_FLAGS, task_struct, thread_info.flags);
>>       OFFSET(TASK_TI_PREEMPT_COUNT, task_struct, 
>> thread_info.preempt_count);
>>       OFFSET(TASK_TI_KERNEL_SP, task_struct, thread_info.kernel_sp);
>> @@ -172,6 +173,10 @@ void asm_offsets(void)
>>             offsetof(struct task_struct, thread.s[11])
>>           - offsetof(struct task_struct, thread.ra)
>>       );
>> +    DEFINE(TASK_THREAD_FLAGS_RA,
>> +          offsetof(struct task_struct, thread.flags)
>> +        - offsetof(struct task_struct, thread.ra)
>> +    );
>>       DEFINE(TASK_THREAD_F0_F0,
>>             offsetof(struct task_struct, thread.fstate.f[0])
>> diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
>> index 744f3209c48d..6137f6c2a2ad 100644
>> --- a/arch/riscv/kernel/entry.S
>> +++ b/arch/riscv/kernel/entry.S
>> @@ -417,7 +417,14 @@ ENTRY(__switch_to)
>>       REG_S s9,  TASK_THREAD_S9_RA(a3)
>>       REG_S s10, TASK_THREAD_S10_RA(a3)
>>       REG_S s11, TASK_THREAD_S11_RA(a3)
>> +    /* save (and disable the user space access flag) */
>> +    li    s0, SR_SUM
>> +    csrrc s1, CSR_STATUS, s0
> 
> I don't think that clearing SR_SUM is needed here since you restore its 
> value later on.

The restore I think requires the flag to have been cleared.

>> +    REG_S s1,  TASK_THREAD_FLAGS_RA(a3)
>> +
>>       /* Restore context from next->thread */
>> +    REG_L s0,  TASK_THREAD_FLAGS_RA(a4)
>> +    csrs  CSR_STATUS, s0
>>       REG_L ra,  TASK_THREAD_RA_RA(a4)
>>       REG_L sp,  TASK_THREAD_SP_RA(a4)
>>       REG_L s0,  TASK_THREAD_S0_RA(a4)
>>
> 
> Again, very nice finding, well done :)
> 
> Alex
> 
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
>

Ben Dooks March 22, 2021, 12:05 p.m. UTC | #3

On 18/03/2021 17:42, Ben Dooks wrote:
> On 18/03/2021 16:57, Alex Ghiti wrote:
>> Hi Ben,
>>
>> Le 3/18/21 à 11:10 AM, Ben Dooks a écrit :
>>> When threads/tasks are switched we need to ensure the old execution's
>>> SR_SUM state is saved and the new thread has the old SR_SUM state
>>> restored.
>>>
>>> The issue is seen under heavy load especially with the syz-stress tool
>>> running, with crashes as follows in schedule_tail:
>>>
>>> Unable to handle kernel access to user memory without uaccess 
>>> routines at virtual address 000000002749f0d0
>>> Oops [#1]
>>> Modules linked in:

[snip]

I've had syz-stress running over the weekend now with no further
occurrence of the fault. I'll put the put_user() patch out now and
will further investigate flags over __switch_to()

diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
index 021ed64ee608..e2398c6e7af9 100644
--- a/arch/riscv/include/asm/processor.h
+++ b/arch/riscv/include/asm/processor.h
@@ -35,6 +35,7 @@  struct thread_struct {
 	unsigned long s[12];	/* s[0]: frame pointer */
 	struct __riscv_d_ext_state fstate;
 	unsigned long bad_cause;
+	unsigned long flags;
 };
 
 #define INIT_THREAD {					\
diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c
index 9ef33346853c..0036570752d9 100644
--- a/arch/riscv/kernel/asm-offsets.c
+++ b/arch/riscv/kernel/asm-offsets.c
@@ -29,6 +29,7 @@  void asm_offsets(void)
 	OFFSET(TASK_THREAD_S9, task_struct, thread.s[9]);
 	OFFSET(TASK_THREAD_S10, task_struct, thread.s[10]);
 	OFFSET(TASK_THREAD_S11, task_struct, thread.s[11]);
+	OFFSET(TASK_THREAD_FLAGS, task_struct, thread.flags);
 	OFFSET(TASK_TI_FLAGS, task_struct, thread_info.flags);
 	OFFSET(TASK_TI_PREEMPT_COUNT, task_struct, thread_info.preempt_count);
 	OFFSET(TASK_TI_KERNEL_SP, task_struct, thread_info.kernel_sp);
@@ -172,6 +173,10 @@  void asm_offsets(void)
 		  offsetof(struct task_struct, thread.s[11])
 		- offsetof(struct task_struct, thread.ra)
 	);
+	DEFINE(TASK_THREAD_FLAGS_RA,
+		  offsetof(struct task_struct, thread.flags)
+		- offsetof(struct task_struct, thread.ra)
+	);
 
 	DEFINE(TASK_THREAD_F0_F0,
 		  offsetof(struct task_struct, thread.fstate.f[0])
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
index 744f3209c48d..6137f6c2a2ad 100644
--- a/arch/riscv/kernel/entry.S
+++ b/arch/riscv/kernel/entry.S
@@ -417,7 +417,14 @@  ENTRY(__switch_to)
 	REG_S s9,  TASK_THREAD_S9_RA(a3)
 	REG_S s10, TASK_THREAD_S10_RA(a3)
 	REG_S s11, TASK_THREAD_S11_RA(a3)
+	/* save (and disable the user space access flag) */
+	li    s0, SR_SUM
+	csrrc s1, CSR_STATUS, s0
+	REG_S s1,  TASK_THREAD_FLAGS_RA(a3)
+
 	/* Restore context from next->thread */
+	REG_L s0,  TASK_THREAD_FLAGS_RA(a4)
+	csrs  CSR_STATUS, s0
 	REG_L ra,  TASK_THREAD_RA_RA(a4)
 	REG_L sp,  TASK_THREAD_SP_RA(a4)
 	REG_L s0,  TASK_THREAD_S0_RA(a4)

RFC: riscv: save the SR_SUM status over switches

Commit Message

Comments

Patch