diff mbox series

RFC: riscv: evaluate put_user() arg before enabling user access

Message ID 20210318224135.134344-1-ben.dooks@codethink.co.uk (mailing list archive)
State New, archived
Headers show
Series RFC: riscv: evaluate put_user() arg before enabling user access | expand

Commit Message

Ben Dooks March 18, 2021, 10:41 p.m. UTC
The <asm/uaccess.h> header has a problem with
put_user(a, ptr) if the 'a' is not a simple
variable, such as a function. This can lead
to the compiler producing code as so:

1:	enable_user_access()
2:	evaluate 'a'
3:	put 'a' to 'ptr'
4:	disable_user_acess()

The issue is that 'a' is now being evaluated
with the user memory protections disabled. So
we try and force the evaulation by assinging
'x' to __val at the start, and hoping the
compiler barriers in enable_user_access()
do the job of ordering step 2 before step 1.

This has shown up in a bug where 'a' sleeps
and thus schedules out and loses the SR_SUM
flag. This isn't sufficient to fully fix, but
should reduce the window of opportunity.

Cc: Arnd Bergman <arnd@arndb.de>
---
 arch/riscv/include/asm/uaccess.h | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

Comments

Arnd Bergmann March 18, 2021, 10:48 p.m. UTC | #1
On Thu, Mar 18, 2021 at 11:41 PM Ben Dooks <ben.dooks@codethink.co.uk> wrote:
>
> The <asm/uaccess.h> header has a problem with
> put_user(a, ptr) if the 'a' is not a simple
> variable, such as a function. This can lead
> to the compiler producing code as so:
>
> 1:      enable_user_access()
> 2:      evaluate 'a'
> 3:      put 'a' to 'ptr'
> 4:      disable_user_acess()
>
> The issue is that 'a' is now being evaluated
> with the user memory protections disabled. So
> we try and force the evaulation by assinging
> 'x' to __val at the start, and hoping the
> compiler barriers in enable_user_access()
> do the job of ordering step 2 before step 1.
>
> This has shown up in a bug where 'a' sleeps
> and thus schedules out and loses the SR_SUM
> flag. This isn't sufficient to fully fix, but
> should reduce the window of opportunity.
>
> Cc: Arnd Bergman <arnd@arndb.de>

Reviewed-by: Arnd Bergman <arnd@arndb.de>

Note: your Signed-off-by seems to be missing.

> ---
>  arch/riscv/include/asm/uaccess.h | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h
> index 824b2c9da75b..7bf90d462ec9 100644
> --- a/arch/riscv/include/asm/uaccess.h
> +++ b/arch/riscv/include/asm/uaccess.h
> @@ -306,7 +306,10 @@ do {                                                               \
>   * data types like structures or arrays.
>   *
>   * @ptr must have pointer-to-simple-variable type, and @x must be assignable
> - * to the result of dereferencing @ptr.
> + * to the result of dereferencing @ptr. The @x is copied inside the macro
> + * to avoid code re-ordering where @x gets evaulated within the block that
> + * enables user-space access (thus possibly bypassing some of the protection
> + * this feautre provides).
>   *
>   * Caller must check the pointer with access_ok() before calling this
>   * function.
> @@ -316,12 +319,13 @@ do {                                                              \
>  #define __put_user(x, ptr)                                     \
>  ({                                                             \
>         __typeof__(*(ptr)) __user *__gu_ptr = (ptr);            \
> +       __typeof__(*__gu_ptr) __val = (x);                      \
>         long __pu_err = 0;                                      \

Side note: We should really change the first typeof to an __auto_type
here (and on all other architectures), but that's a separate change
and not strictly a bugfix.

        Arnd
Ben Dooks March 18, 2021, 11:46 p.m. UTC | #2
On 18/03/2021 22:48, Arnd Bergmann wrote:
> On Thu, Mar 18, 2021 at 11:41 PM Ben Dooks <ben.dooks@codethink.co.uk> wrote:
>>
>> The <asm/uaccess.h> header has a problem with
>> put_user(a, ptr) if the 'a' is not a simple
>> variable, such as a function. This can lead
>> to the compiler producing code as so:
>>
>> 1:      enable_user_access()
>> 2:      evaluate 'a'
>> 3:      put 'a' to 'ptr'
>> 4:      disable_user_acess()
>>
>> The issue is that 'a' is now being evaluated
>> with the user memory protections disabled. So
>> we try and force the evaulation by assinging
>> 'x' to __val at the start, and hoping the
>> compiler barriers in enable_user_access()
>> do the job of ordering step 2 before step 1.
>>
>> This has shown up in a bug where 'a' sleeps
>> and thus schedules out and loses the SR_SUM
>> flag. This isn't sufficient to fully fix, but
>> should reduce the window of opportunity.
>>
>> Cc: Arnd Bergman <arnd@arndb.de>
> 
> Reviewed-by: Arnd Bergman <arnd@arndb.de>
> 
> Note: your Signed-off-by seems to be missing.

Sorry, forgot as was intending smaller RFC release.

Thanks for the help sorting out the compile issues

I need to review the patch description anyway and make it flow
closer to 73 columns instead of the rather truncated version it is.
Christoph Hellwig March 19, 2021, 1:05 p.m. UTC | #3
On Thu, Mar 18, 2021 at 10:41:35PM +0000, Ben Dooks wrote:
> The <asm/uaccess.h> header has a problem with
> put_user(a, ptr) if the 'a' is not a simple
> variable, such as a function. This can lead
> to the compiler producing code as so:

Nit: your commit log seeems to truncate lines after 50 chars, you can
and should use almost 1.5 as much.

>   * @ptr must have pointer-to-simple-variable type, and @x must be assignable
> - * to the result of dereferencing @ptr.
> + * to the result of dereferencing @ptr. The @x is copied inside the macro
> + * to avoid code re-ordering where @x gets evaulated within the block that
> + * enables user-space access (thus possibly bypassing some of the protection
> + * this feautre provides).

Well, hopefully the compiler is smart enought to not actually copy.
So we should probably talk about evaluating the argument here.

>  #define __put_user(x, ptr)					\
>  ({								\
>  	__typeof__(*(ptr)) __user *__gu_ptr = (ptr);		\
> +	__typeof__(*__gu_ptr) __val = (x);			\
>  	long __pu_err = 0;					\
>  								\
>  	__chk_user_ptr(__gu_ptr);				\
>  								\
>  	__enable_user_access();					\
> -	__put_user_nocheck(x, __gu_ptr, __pu_err);		\
> +	__put_user_nocheck(__val, __gu_ptr, __pu_err);		\
>  	__disable_user_access();				\

It looks like __get_user needs the same treatment.
Ben Dooks March 19, 2021, 2:19 p.m. UTC | #4
On 19/03/2021 13:05, Christoph Hellwig wrote:
> On Thu, Mar 18, 2021 at 10:41:35PM +0000, Ben Dooks wrote:
>> The <asm/uaccess.h> header has a problem with
>> put_user(a, ptr) if the 'a' is not a simple
>> variable, such as a function. This can lead
>> to the compiler producing code as so:
> 
> Nit: your commit log seeems to truncate lines after 50 chars, you can
> and should use almost 1.5 as much.

Thanks, noted this once I'd re-read the patch. I have a few
other minor bits to test and to credit Arnd with helping out
after failing to get the first attempt to compile.

>>    * @ptr must have pointer-to-simple-variable type, and @x must be assignable
>> - * to the result of dereferencing @ptr.
>> + * to the result of dereferencing @ptr. The @x is copied inside the macro
>> + * to avoid code re-ordering where @x gets evaulated within the block that
>> + * enables user-space access (thus possibly bypassing some of the protection
>> + * this feautre provides).
> 
> Well, hopefully the compiler is smart enought to not actually copy.
> So we should probably talk about evaluating the argument here.
> 
>>   #define __put_user(x, ptr)					\
>>   ({								\
>>   	__typeof__(*(ptr)) __user *__gu_ptr = (ptr);		\
>> +	__typeof__(*__gu_ptr) __val = (x);			\
>>   	long __pu_err = 0;					\
>>   								\
>>   	__chk_user_ptr(__gu_ptr);				\
>>   								\
>>   	__enable_user_access();					\
>> -	__put_user_nocheck(x, __gu_ptr, __pu_err);		\
>> +	__put_user_nocheck(__val, __gu_ptr, __pu_err);		\
>>   	__disable_user_access();				\
> 
> It looks like __get_user needs the same treatment.

I will check that, then again I don't think people do anything
that would be an issue. We caught this from the put_user() in the
schedule_tail() code which causes the pid fetch function to be
called within the __enable_user_access().
Alexandre Ghiti March 19, 2021, 3:03 p.m. UTC | #5
Le 3/18/21 à 6:41 PM, Ben Dooks a écrit :
> The <asm/uaccess.h> header has a problem with
> put_user(a, ptr) if the 'a' is not a simple
> variable, such as a function. This can lead
> to the compiler producing code as so:
> 
> 1:	enable_user_access()
> 2:	evaluate 'a'
> 3:	put 'a' to 'ptr'
> 4:	disable_user_acess()
> 
> The issue is that 'a' is now being evaluated
> with the user memory protections disabled. So
> we try and force the evaulation by assinging
> 'x' to __val at the start, and hoping the
> compiler barriers in enable_user_access()
> do the job of ordering step 2 before step 1.
> 
> This has shown up in a bug where 'a' sleeps
> and thus schedules out and loses the SR_SUM
> flag. This isn't sufficient to fully fix, but
> should reduce the window of opportunity.

I would say this patch is enough to fix the issue because it only 
happens when 'a' *explicitly schedules* when in 
__enable_user_access()/__disable_user_access(). Otherwise, I see 2 cases:

- user memory is correctly mapped and nothing stops the current process.
- an exception (interrupt or trap) is triggered: in those cases, the 
exception path correctly saves and restores SR_SUM.

> 
> Cc: Arnd Bergman <arnd@arndb.de>
> ---
>   arch/riscv/include/asm/uaccess.h | 8 ++++++--
>   1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h
> index 824b2c9da75b..7bf90d462ec9 100644
> --- a/arch/riscv/include/asm/uaccess.h
> +++ b/arch/riscv/include/asm/uaccess.h
> @@ -306,7 +306,10 @@ do {								\
>    * data types like structures or arrays.
>    *
>    * @ptr must have pointer-to-simple-variable type, and @x must be assignable
> - * to the result of dereferencing @ptr.
> + * to the result of dereferencing @ptr. The @x is copied inside the macro
> + * to avoid code re-ordering where @x gets evaulated within the block that
> + * enables user-space access (thus possibly bypassing some of the protection
> + * this feautre provides).
>    *
>    * Caller must check the pointer with access_ok() before calling this
>    * function.
> @@ -316,12 +319,13 @@ do {								\
>   #define __put_user(x, ptr)					\
>   ({								\
>   	__typeof__(*(ptr)) __user *__gu_ptr = (ptr);		\
> +	__typeof__(*__gu_ptr) __val = (x);			\
>   	long __pu_err = 0;					\
>   								\
>   	__chk_user_ptr(__gu_ptr);				\
>   								\
>   	__enable_user_access();					\
> -	__put_user_nocheck(x, __gu_ptr, __pu_err);		\
> +	__put_user_nocheck(__val, __gu_ptr, __pu_err);		\
>   	__disable_user_access();				\
>   								\
>   	__pu_err;						\
>
Ben Dooks March 19, 2021, 3:09 p.m. UTC | #6
On 19/03/2021 15:03, Alex Ghiti wrote:
> Le 3/18/21 à 6:41 PM, Ben Dooks a écrit :
>> The <asm/uaccess.h> header has a problem with
>> put_user(a, ptr) if the 'a' is not a simple
>> variable, such as a function. This can lead
>> to the compiler producing code as so:
>>
>> 1:    enable_user_access()
>> 2:    evaluate 'a'
>> 3:    put 'a' to 'ptr'
>> 4:    disable_user_acess()
>>
>> The issue is that 'a' is now being evaluated
>> with the user memory protections disabled. So
>> we try and force the evaulation by assinging
>> 'x' to __val at the start, and hoping the
>> compiler barriers in enable_user_access()
>> do the job of ordering step 2 before step 1.
>>
>> This has shown up in a bug where 'a' sleeps
>> and thus schedules out and loses the SR_SUM
>> flag. This isn't sufficient to fully fix, but
>> should reduce the window of opportunity.
> 
> I would say this patch is enough to fix the issue because it only 
> happens when 'a' *explicitly schedules* when in 
> __enable_user_access()/__disable_user_access(). Otherwise, I see 2 cases:
> 
> - user memory is correctly mapped and nothing stops the current process.
> - an exception (interrupt or trap) is triggered: in those cases, the 
> exception path correctly saves and restores SR_SUM.

This fixes part of the other issue.

I did point out in the other email there could be longer cases
where the protections are disabled. The saving of the flags over
switch_to() is still necessary.

Also, I am not sure if this will guarantee ordering. It does
seem to fix it for the cases I checked

>>
>> Cc: Arnd Bergman <arnd@arndb.de>
>> ---
>>   arch/riscv/include/asm/uaccess.h | 8 ++++++--
>>   1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/riscv/include/asm/uaccess.h 
>> b/arch/riscv/include/asm/uaccess.h
>> index 824b2c9da75b..7bf90d462ec9 100644
>> --- a/arch/riscv/include/asm/uaccess.h
>> +++ b/arch/riscv/include/asm/uaccess.h
>> @@ -306,7 +306,10 @@ do {                                \
>>    * data types like structures or arrays.
>>    *
>>    * @ptr must have pointer-to-simple-variable type, and @x must be 
>> assignable
>> - * to the result of dereferencing @ptr.
>> + * to the result of dereferencing @ptr. The @x is copied inside the 
>> macro
>> + * to avoid code re-ordering where @x gets evaulated within the block 
>> that
>> + * enables user-space access (thus possibly bypassing some of the 
>> protection
>> + * this feautre provides).
>>    *
>>    * Caller must check the pointer with access_ok() before calling this
>>    * function.
>> @@ -316,12 +319,13 @@ do {                                \
>>   #define __put_user(x, ptr)                    \
>>   ({                                \
>>       __typeof__(*(ptr)) __user *__gu_ptr = (ptr);        \
>> +    __typeof__(*__gu_ptr) __val = (x);            \
>>       long __pu_err = 0;                    \
>>                                   \
>>       __chk_user_ptr(__gu_ptr);                \
>>                                   \
>>       __enable_user_access();                    \
>> -    __put_user_nocheck(x, __gu_ptr, __pu_err);        \
>> +    __put_user_nocheck(__val, __gu_ptr, __pu_err);        \
>>       __disable_user_access();                \
>>                                   \
>>       __pu_err;                        \
>>
> 
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
>
Alexandre Ghiti March 19, 2021, 4:12 p.m. UTC | #7
Le 3/19/21 à 11:09 AM, Ben Dooks a écrit :
> On 19/03/2021 15:03, Alex Ghiti wrote:
>> Le 3/18/21 à 6:41 PM, Ben Dooks a écrit :
>>> The <asm/uaccess.h> header has a problem with
>>> put_user(a, ptr) if the 'a' is not a simple
>>> variable, such as a function. This can lead
>>> to the compiler producing code as so:
>>>
>>> 1:    enable_user_access()
>>> 2:    evaluate 'a'
>>> 3:    put 'a' to 'ptr'
>>> 4:    disable_user_acess()
>>>
>>> The issue is that 'a' is now being evaluated
>>> with the user memory protections disabled. So
>>> we try and force the evaulation by assinging
>>> 'x' to __val at the start, and hoping the
>>> compiler barriers in enable_user_access()
>>> do the job of ordering step 2 before step 1.
>>>
>>> This has shown up in a bug where 'a' sleeps
>>> and thus schedules out and loses the SR_SUM
>>> flag. This isn't sufficient to fully fix, but
>>> should reduce the window of opportunity.
>>
>> I would say this patch is enough to fix the issue because it only 
>> happens when 'a' *explicitly schedules* when in 
>> __enable_user_access()/__disable_user_access(). Otherwise, I see 2 cases:
>>
>> - user memory is correctly mapped and nothing stops the current process.
>> - an exception (interrupt or trap) is triggered: in those cases, the 
>> exception path correctly saves and restores SR_SUM.
> 
> This fixes part of the other issue.
> 
> I did point out in the other email there could be longer cases
> where the protections are disabled. The saving of the flags over
> switch_to() is still necessary.

I can't find your explanation, could you elaborate a bit more here on 
why this fix is not enough ?

Thanks !

> 
> Also, I am not sure if this will guarantee ordering. It does
> seem to fix it for the cases I checked
> 
>>>
>>> Cc: Arnd Bergman <arnd@arndb.de>
>>> ---
>>>   arch/riscv/include/asm/uaccess.h | 8 ++++++--
>>>   1 file changed, 6 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/riscv/include/asm/uaccess.h 
>>> b/arch/riscv/include/asm/uaccess.h
>>> index 824b2c9da75b..7bf90d462ec9 100644
>>> --- a/arch/riscv/include/asm/uaccess.h
>>> +++ b/arch/riscv/include/asm/uaccess.h
>>> @@ -306,7 +306,10 @@ do {                                \
>>>    * data types like structures or arrays.
>>>    *
>>>    * @ptr must have pointer-to-simple-variable type, and @x must be 
>>> assignable
>>> - * to the result of dereferencing @ptr.
>>> + * to the result of dereferencing @ptr. The @x is copied inside the 
>>> macro
>>> + * to avoid code re-ordering where @x gets evaulated within the 
>>> block that
>>> + * enables user-space access (thus possibly bypassing some of the 
>>> protection
>>> + * this feautre provides).
>>>    *
>>>    * Caller must check the pointer with access_ok() before calling this
>>>    * function.
>>> @@ -316,12 +319,13 @@ do {                                \
>>>   #define __put_user(x, ptr)                    \
>>>   ({                                \
>>>       __typeof__(*(ptr)) __user *__gu_ptr = (ptr);        \
>>> +    __typeof__(*__gu_ptr) __val = (x);            \
>>>       long __pu_err = 0;                    \
>>>                                   \
>>>       __chk_user_ptr(__gu_ptr);                \
>>>                                   \
>>>       __enable_user_access();                    \
>>> -    __put_user_nocheck(x, __gu_ptr, __pu_err);        \
>>> +    __put_user_nocheck(__val, __gu_ptr, __pu_err);        \
>>>       __disable_user_access();                \
>>>                                   \
>>>       __pu_err;                        \
>>>
>>
>> _______________________________________________
>> linux-riscv mailing list
>> linux-riscv@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-riscv
>>
> 
>
Ben Dooks March 19, 2021, 9:56 p.m. UTC | #8
On 19/03/2021 16:12, Alex Ghiti wrote:
> Le 3/19/21 à 11:09 AM, Ben Dooks a écrit :
>> On 19/03/2021 15:03, Alex Ghiti wrote:
>>> Le 3/18/21 à 6:41 PM, Ben Dooks a écrit :
>>>> The <asm/uaccess.h> header has a problem with
>>>> put_user(a, ptr) if the 'a' is not a simple
>>>> variable, such as a function. This can lead
>>>> to the compiler producing code as so:
>>>>
>>>> 1:    enable_user_access()
>>>> 2:    evaluate 'a'
>>>> 3:    put 'a' to 'ptr'
>>>> 4:    disable_user_acess()
>>>>
>>>> The issue is that 'a' is now being evaluated
>>>> with the user memory protections disabled. So
>>>> we try and force the evaulation by assinging
>>>> 'x' to __val at the start, and hoping the
>>>> compiler barriers in enable_user_access()
>>>> do the job of ordering step 2 before step 1.
>>>>
>>>> This has shown up in a bug where 'a' sleeps
>>>> and thus schedules out and loses the SR_SUM
>>>> flag. This isn't sufficient to fully fix, but
>>>> should reduce the window of opportunity.
>>>
>>> I would say this patch is enough to fix the issue because it only 
>>> happens when 'a' *explicitly schedules* when in 
>>> __enable_user_access()/__disable_user_access(). Otherwise, I see 2 
>>> cases:
>>>
>>> - user memory is correctly mapped and nothing stops the current process.
>>> - an exception (interrupt or trap) is triggered: in those cases, the 
>>> exception path correctly saves and restores SR_SUM.
>>
>> This fixes part of the other issue.
>>
>> I did point out in the other email there could be longer cases
>> where the protections are disabled. The saving of the flags over
>> switch_to() is still necessary.
> 
> I can't find your explanation, could you elaborate a bit more here on 
> why this fix is not enough ?


I would have to check if this current applies to riscv, but there is
code that does the following (fs/select.c does this):

                 if (!user_read_access_begin(from, sizeof(*from)))
                         return -EFAULT;
                 unsafe_get_user(to->p, &from->p, Efault);
                 unsafe_get_user(to->size, &from->size, Efault);
                 user_read_access_end();

My argument for fixing with both is:

- cover more than the put_user() case
- try and avoid any future bug
- ensure we do not leak SR_SUM elsewhere

I may also have a quick check to see if we don't also leak other
flags during these swaps. It might be we should save and restore
all flags.

I do see that this fix is going to hit a good proportion of the
cases we've seen so far. I could try and run a stress test with
just this in over the weekend (so far syz-stress has been running
for over 24hrs with minimal issues)
diff mbox series

Patch

diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h
index 824b2c9da75b..7bf90d462ec9 100644
--- a/arch/riscv/include/asm/uaccess.h
+++ b/arch/riscv/include/asm/uaccess.h
@@ -306,7 +306,10 @@  do {								\
  * data types like structures or arrays.
  *
  * @ptr must have pointer-to-simple-variable type, and @x must be assignable
- * to the result of dereferencing @ptr.
+ * to the result of dereferencing @ptr. The @x is copied inside the macro
+ * to avoid code re-ordering where @x gets evaulated within the block that
+ * enables user-space access (thus possibly bypassing some of the protection
+ * this feautre provides).
  *
  * Caller must check the pointer with access_ok() before calling this
  * function.
@@ -316,12 +319,13 @@  do {								\
 #define __put_user(x, ptr)					\
 ({								\
 	__typeof__(*(ptr)) __user *__gu_ptr = (ptr);		\
+	__typeof__(*__gu_ptr) __val = (x);			\
 	long __pu_err = 0;					\
 								\
 	__chk_user_ptr(__gu_ptr);				\
 								\
 	__enable_user_access();					\
-	__put_user_nocheck(x, __gu_ptr, __pu_err);		\
+	__put_user_nocheck(__val, __gu_ptr, __pu_err);		\
 	__disable_user_access();				\
 								\
 	__pu_err;						\