[RFC] arm64: Fix __addr_ok and __range_ok macros

Message ID 1394059289-3972-1-git-send-email-cov@codeaurora.org (mailing list archive)
State New, archived

Commit Message

Christopher Covington March 5, 2014, 10:41 p.m. UTC
Without this, the following scenario is incorrectly determined
to be invalid.

addr 0x7f_ffffe000 size 8192 addr_limit 0x80_00000000

This behavior was observed while trying to vmsplice the stack
as part of a CRIU dump of a process.

Signed-off-by: Christopher Covington <cov@codeaurora.org>
---
 arch/arm64/include/asm/uaccess.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
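
To make the boundary case from the commit message concrete, here is a minimal
C sketch (not the kernel code). The helper names range_ok_strict and
range_ok_inclusive are hypothetical, and the GCC/Clang builtin
__builtin_add_overflow is assumed as a stand-in for the 65-bit addition. With
the values above, addr + size lands exactly on addr_limit, so the strict
comparison rejects the range while the inclusive one accepts it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the pre-patch test: addr + size < limit. */
static bool range_ok_strict(uint64_t addr, uint64_t size, uint64_t limit)
{
	uint64_t end;

	/* A carry out of bit 63 means the range wraps, so it is invalid. */
	if (__builtin_add_overflow(addr, size, &end))
		return false;
	return end < limit;
}

/* Hypothetical model of the proposed test: addr + size <= limit. */
static bool range_ok_inclusive(uint64_t addr, uint64_t size, uint64_t limit)
{
	uint64_t end;

	if (__builtin_add_overflow(addr, size, &end))
		return false;
	return end <= limit;
}
```

The last valid byte of the range is at addr + size - 1, which is still below
addr_limit, so accepting the range does not admit any address at or above the
limit.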

Comments

Michael S. Tsirkin March 6, 2014, 8:20 a.m. UTC | #1
On Wed, Mar 05, 2014 at 05:41:28PM -0500, Christopher Covington wrote:
> Without this, the following scenario is incorrectly determined
> to be invalid.
> 
> addr 0x7f_ffffe000 size 8192 addr_limit 0x80_00000000
> 
> This behavior was observed while trying to vmsplice the stack
> as part of a CRIU dump of a process.
> 
> Signed-off-by: Christopher Covington <cov@codeaurora.org>
> ---
>  arch/arm64/include/asm/uaccess.h | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
> index edb3d5c..9309024 100644
> --- a/arch/arm64/include/asm/uaccess.h
> +++ b/arch/arm64/include/asm/uaccess.h
> @@ -66,12 +66,12 @@ static inline void set_fs(mm_segment_t fs)
>  #define segment_eq(a,b)	((a) == (b))
>  
>  /*
> - * Return 1 if addr < current->addr_limit, 0 otherwise.
> + * Return 1 if addr <= current->addr_limit, 0 otherwise.
>   */
>  #define __addr_ok(addr)							\
>  ({									\
>  	unsigned long flag;						\
> -	asm("cmp %1, %0; cset %0, lo"					\
> +	asm("cmp %1, %0; cset %0, ls"					\
>  		: "=&r" (flag)						\
>  		: "r" (addr), "0" (current_thread_info()->addr_limit)	\
>  		: "cc");						\


BTW can this use mov %0, #0 like arch/arm/include/asm/uaccess.h does?
Would make it more portable ...


> @@ -83,7 +83,7 @@ static inline void set_fs(mm_segment_t fs)
>   * Returns 1 if the range is valid, 0 otherwise.
>   *
>   * This is equivalent to the following test:
> - * (u65)addr + (u65)size < (u65)current->addr_limit
> + * (u65)addr + (u65)size <= current->addr_limit
>   *
>   * This needs 65-bit arithmetic.
>   */
> @@ -91,7 +91,7 @@ static inline void set_fs(mm_segment_t fs)
>  ({									\
>  	unsigned long flag, roksum;					\
>  	__chk_user_ptr(addr);						\
> -	asm("adds %1, %1, %3; ccmp %1, %4, #2, cc; cset %0, cc"		\
> +	asm("adds %1, %1, %3; ccmp %1, %4, #3, cc; cset %0, ls"		\
>  		: "=&r" (flag), "=&r" (roksum)				\
>  		: "1" (addr), "Ir" (size),				\
>  		  "r" (current_thread_info()->addr_limit)		\
> -- 
> Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> hosted by the Linux Foundation.
Will Deacon March 6, 2014, 4:08 p.m. UTC | #2
On Thu, Mar 06, 2014 at 08:20:23AM +0000, Michael S. Tsirkin wrote:
> On Wed, Mar 05, 2014 at 05:41:28PM -0500, Christopher Covington wrote:
> > Without this, the following scenario is incorrectly determined
> > to be invalid.
> > 
> > addr 0x7f_ffffe000 size 8192 addr_limit 0x80_00000000
> > 
> > This behavior was observed while trying to vmsplice the stack
> > as part of a CRIU dump of a process.
> > 
> > Signed-off-by: Christopher Covington <cov@codeaurora.org>
> > ---
> >  arch/arm64/include/asm/uaccess.h | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
> > index edb3d5c..9309024 100644
> > --- a/arch/arm64/include/asm/uaccess.h
> > +++ b/arch/arm64/include/asm/uaccess.h
> > @@ -66,12 +66,12 @@ static inline void set_fs(mm_segment_t fs)
> >  #define segment_eq(a,b)	((a) == (b))
> >  
> >  /*
> > - * Return 1 if addr < current->addr_limit, 0 otherwise.
> > + * Return 1 if addr <= current->addr_limit, 0 otherwise.
> >   */
> >  #define __addr_ok(addr)							\
> >  ({									\
> >  	unsigned long flag;						\
> > -	asm("cmp %1, %0; cset %0, lo"					\
> > +	asm("cmp %1, %0; cset %0, ls"					\
> >  		: "=&r" (flag)						\
> >  		: "r" (addr), "0" (current_thread_info()->addr_limit)	\
> >  		: "cc");						\

I don't think this is correct, since __addr_ok will now return true for
TASK_SIZE_64.

> BTW can this use mov %0, #0 like arch/arm/include/asm/uaccess.h does?
> Would make it more portable ...

How/why should this be made portable?

> > @@ -83,7 +83,7 @@ static inline void set_fs(mm_segment_t fs)
> >   * Returns 1 if the range is valid, 0 otherwise.
> >   *
> >   * This is equivalent to the following test:
> > - * (u65)addr + (u65)size < (u65)current->addr_limit
> > + * (u65)addr + (u65)size <= current->addr_limit
> >   *
> >   * This needs 65-bit arithmetic.
> >   */
> > @@ -91,7 +91,7 @@ static inline void set_fs(mm_segment_t fs)
> >  ({									\
> >  	unsigned long flag, roksum;					\
> >  	__chk_user_ptr(addr);						\
> > -	asm("adds %1, %1, %3; ccmp %1, %4, #2, cc; cset %0, cc"		\
> > +	asm("adds %1, %1, %3; ccmp %1, %4, #3, cc; cset %0, ls"		\
> >  		: "=&r" (flag), "=&r" (roksum)				\
> >  		: "1" (addr), "Ir" (size),				\
> >  		  "r" (current_thread_info()->addr_limit)		\

Can't you just pass current_thread_info()->addr_limit - 1 here and be done
with it?

Will
Christopher Covington March 7, 2014, 1:22 p.m. UTC | #3
Hi Michael,

Thanks for the comments.

On 03/06/2014 03:20 AM, Michael S. Tsirkin wrote:
> On Wed, Mar 05, 2014 at 05:41:28PM -0500, Christopher Covington wrote:
>> Without this, the following scenario is incorrectly determined
>> to be invalid.
>>
>> addr 0x7f_ffffe000 size 8192 addr_limit 0x80_00000000
>>
>> This behavior was observed while trying to vmsplice the stack
>> as part of a CRIU dump of a process.
>>
>> Signed-off-by: Christopher Covington <cov@codeaurora.org>
>> ---
>>  arch/arm64/include/asm/uaccess.h | 8 ++++----
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
>> index edb3d5c..9309024 100644
>> --- a/arch/arm64/include/asm/uaccess.h
>> +++ b/arch/arm64/include/asm/uaccess.h
>> @@ -66,12 +66,12 @@ static inline void set_fs(mm_segment_t fs)
>>  #define segment_eq(a,b)	((a) == (b))
>>  
>>  /*
>> - * Return 1 if addr < current->addr_limit, 0 otherwise.
>> + * Return 1 if addr <= current->addr_limit, 0 otherwise.
>>   */
>>  #define __addr_ok(addr)							\
>>  ({									\
>>  	unsigned long flag;						\
>> -	asm("cmp %1, %0; cset %0, lo"					\
>> +	asm("cmp %1, %0; cset %0, ls"					\
>>  		: "=&r" (flag)						\
>>  		: "r" (addr), "0" (current_thread_info()->addr_limit)	\
>>  		: "cc");						\
> 
> 
> BTW can this use mov %0, #0 like arch/arm/include/asm/uaccess.h does?

The A32 implementation uses "movlo", a conditional move instruction. My
reading of section 3.2 Conditional Instructions of the ARMv8 ISA overview [1]
and other documentation has led me to believe conditional move instructions as
such are not available in A64, hence the choice of "cset".

[1] http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.genc010197a/index.html

Christopher
Catalin Marinas March 13, 2014, 11:20 a.m. UTC | #4
On Wed, Mar 05, 2014 at 10:41:28PM +0000, Christopher Covington wrote:
> --- a/arch/arm64/include/asm/uaccess.h
> +++ b/arch/arm64/include/asm/uaccess.h
> @@ -66,12 +66,12 @@ static inline void set_fs(mm_segment_t fs)
>  #define segment_eq(a,b)	((a) == (b))
>  
>  /*
> - * Return 1 if addr < current->addr_limit, 0 otherwise.
> + * Return 1 if addr <= current->addr_limit, 0 otherwise.
>   */
>  #define __addr_ok(addr)							\
>  ({									\
>  	unsigned long flag;						\
> -	asm("cmp %1, %0; cset %0, lo"					\
> +	asm("cmp %1, %0; cset %0, ls"					\
>  		: "=&r" (flag)						\
>  		: "r" (addr), "0" (current_thread_info()->addr_limit)	\
>  		: "cc");						\

As Will said, this doesn't look right. Why do you need TASK_SIZE_64 to
be valid?

> @@ -83,7 +83,7 @@ static inline void set_fs(mm_segment_t fs)
>   * Returns 1 if the range is valid, 0 otherwise.
>   *
>   * This is equivalent to the following test:
> - * (u65)addr + (u65)size < (u65)current->addr_limit
> + * (u65)addr + (u65)size <= current->addr_limit
>   *
>   * This needs 65-bit arithmetic.
>   */
> @@ -91,7 +91,7 @@ static inline void set_fs(mm_segment_t fs)
>  ({									\
>  	unsigned long flag, roksum;					\
>  	__chk_user_ptr(addr);						\
> -	asm("adds %1, %1, %3; ccmp %1, %4, #2, cc; cset %0, cc"		\
> +	asm("adds %1, %1, %3; ccmp %1, %4, #3, cc; cset %0, ls"		\
>  		: "=&r" (flag), "=&r" (roksum)				\
>  		: "1" (addr), "Ir" (size),				\
>  		  "r" (current_thread_info()->addr_limit)		\

Just trying to understand: if adds does not set the C flag, we go on and
do the ccmp. If addr + size <= addr_limit, "cset ls" sets the flag
variable. If addr + size actually sets the C flag, we need to make sure
that "cset ls" doesn't trigger, which would mean to set C flag and clear
Z flag. So why do you change the ccmp flags from #2 to #3? It looks to
me like #2 is enough.
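
The flag flow described above can be modelled in plain C. This is only a
sketch, assuming the usual A64 semantics: the ccmp immediate is laid out as
N=bit 3, Z=bit 2, C=bit 1, V=bit 0, and the LS condition holds when C is clear
or Z is set. The function name range_ok_model and the nzcv parameter are
hypothetical, added for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of "adds; ccmp sum, limit, #nzcv, cc; cset flag, ls".
 * Only the Z and C flags are tracked because LS reads nothing else.
 */
static int range_ok_model(uint64_t addr, uint64_t size, uint64_t limit,
			  unsigned int nzcv)
{
	uint64_t sum = addr + size;	/* adds: result wraps modulo 2^64 */
	bool carry = sum < addr;	/* C flag produced by adds */
	bool z, c;

	if (!carry) {
		/* Condition "cc" holds: ccmp behaves like cmp sum, limit. */
		c = sum >= limit;	/* C is set when there is no borrow */
		z = sum == limit;
	} else {
		/* Condition fails: the flags are loaded from the immediate. */
		z = (nzcv >> 2) & 1;
		c = (nzcv >> 1) & 1;
	}

	return !c || z;			/* cset ..., ls */
}
```

In this model, passing 2 or 3 as nzcv gives the same result for every input,
since the two immediates differ only in V, which LS does not read.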
Christopher Covington March 13, 2014, 1:41 p.m. UTC | #5
Hi Catalin, Will,

Thanks for your feedback. I must admit I'm out of my depth here, so I just
posted what I had, hoping to solicit comments like what you all have kindly
provided.

On 03/13/2014 07:20 AM, Catalin Marinas wrote:
> On Wed, Mar 05, 2014 at 10:41:28PM +0000, Christopher Covington wrote:
>> --- a/arch/arm64/include/asm/uaccess.h
>> +++ b/arch/arm64/include/asm/uaccess.h
>> @@ -66,12 +66,12 @@ static inline void set_fs(mm_segment_t fs)
>>  #define segment_eq(a,b)	((a) == (b))
>>  
>>  /*
>> - * Return 1 if addr < current->addr_limit, 0 otherwise.
>> + * Return 1 if addr <= current->addr_limit, 0 otherwise.
>>   */
>>  #define __addr_ok(addr)							\
>>  ({									\
>>  	unsigned long flag;						\
>> -	asm("cmp %1, %0; cset %0, lo"					\
>> +	asm("cmp %1, %0; cset %0, ls"					\
>>  		: "=&r" (flag)						\
>>  		: "r" (addr), "0" (current_thread_info()->addr_limit)	\
>>  		: "cc");						\
> 
> As Will said, this doesn't look right. Why do you need TASK_SIZE_64 to
> be valid?

I didn't encounter a case where this was necessary. I was just wondering if a
change was needed for one macro, might it be needed for another? I'm now
convinced that the answer is no.

>> @@ -83,7 +83,7 @@ static inline void set_fs(mm_segment_t fs)
>>   * Returns 1 if the range is valid, 0 otherwise.
>>   *
>>   * This is equivalent to the following test:
>> - * (u65)addr + (u65)size < (u65)current->addr_limit
>> + * (u65)addr + (u65)size <= current->addr_limit
>>   *
>>   * This needs 65-bit arithmetic.
>>   */
>> @@ -91,7 +91,7 @@ static inline void set_fs(mm_segment_t fs)
>>  ({									\
>>  	unsigned long flag, roksum;					\
>>  	__chk_user_ptr(addr);						\
>> -	asm("adds %1, %1, %3; ccmp %1, %4, #2, cc; cset %0, cc"		\
>> +	asm("adds %1, %1, %3; ccmp %1, %4, #3, cc; cset %0, ls"		\
>>  		: "=&r" (flag), "=&r" (roksum)				\
>>  		: "1" (addr), "Ir" (size),				\
>>  		  "r" (current_thread_info()->addr_limit)		\
> 
> Just trying to understand: if adds does not set the C flag, we go on and
> do the ccmp. If addr + size <= addr_limit, "cset ls" sets the flag
> variable. If addr + size actually sets the C flag, we need to make sure
> that "cset ls" doesn't trigger, which would mean to set C flag and clear
> Z flag. So why do you change the ccmp flags from #2 to #3? It looks to
> me like #2 is enough.

#2 is indeed sufficient. I'll respin using it.

I think Will's suggested approach could also work but I figure since I've
taken the time to understand the assembly I might as well fix the problem
there rather than adding another step in the calculation for developers and
compilers to parse. (I don't know if this code is performance critical, but I
nevertheless wanted to see how the compiler handled Will's approach.
Unfortunately my initial implementation resulted in unaligned opcode errors
and I haven't yet dug in.)

Thanks,
Christopher
Catalin Marinas March 13, 2014, 3:53 p.m. UTC | #6
On Thu, Mar 13, 2014 at 01:41:01PM +0000, Christopher Covington wrote:
> On 03/13/2014 07:20 AM, Catalin Marinas wrote:
> > On Wed, Mar 05, 2014 at 10:41:28PM +0000, Christopher Covington wrote:
> >> @@ -83,7 +83,7 @@ static inline void set_fs(mm_segment_t fs)
> >>   * Returns 1 if the range is valid, 0 otherwise.
> >>   *
> >>   * This is equivalent to the following test:
> >> - * (u65)addr + (u65)size < (u65)current->addr_limit
> >> + * (u65)addr + (u65)size <= current->addr_limit
> >>   *
> >>   * This needs 65-bit arithmetic.
> >>   */
> >> @@ -91,7 +91,7 @@ static inline void set_fs(mm_segment_t fs)
> >>  ({									\
> >>  	unsigned long flag, roksum;					\
> >>  	__chk_user_ptr(addr);						\
> >> -	asm("adds %1, %1, %3; ccmp %1, %4, #2, cc; cset %0, cc"		\
> >> +	asm("adds %1, %1, %3; ccmp %1, %4, #3, cc; cset %0, ls"		\
> >>  		: "=&r" (flag), "=&r" (roksum)				\
> >>  		: "1" (addr), "Ir" (size),				\
> >>  		  "r" (current_thread_info()->addr_limit)		\
> > 
> > Just trying to understand: if adds does not set the C flag, we go on and
> > do the ccmp. If addr + size <= addr_limit, "cset ls" sets the flag
> > variable. If addr + size actually sets the C flag, we need to make sure
> > that "cset ls" doesn't trigger, which would mean to set C flag and clear
> > Z flag. So why do you change the ccmp flags from #2 to #3? It looks to
> > me like #2 is enough.
> 
> #2 is indeed sufficient. I'll respin using it.
> 
> I think Will's suggested approach could also work but I figure since I've
> taken the time to understand the assembly I might as well fix the problem
> there rather than adding another step in the calculation for developers and
> compilers to parse. (I don't know if this code is performance critical, but I
> nevertheless wanted to see how the compiler handled Will's approach.
> Unfortunately my initial implementation resulted in unaligned opcode errors
> and I haven't yet dug in.)

If it's only one condition change, I would prefer the inline asm fix. I
haven't done any benchmarks with a C-only implementation to assess the
impact.

For __addr_ok() I think the compiler should be good enough as we don't
need 65-bit arithmetic but we can leave it as it is.
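
For illustration, a C-only __addr_ok() could look roughly like the sketch
below. This is an assumption about what such a version might be, not code from
the patch: the name addr_ok_c is hypothetical, and the limit is passed in as a
parameter rather than read from current_thread_info()->addr_limit so that the
example stands alone.

```c
#include <stdint.h>

/*
 * Hypothetical C-only equivalent of "cmp addr, limit; cset flag, lo".
 * A single 64-bit unsigned compare is enough because no addition is
 * involved, so nothing can overflow.
 */
static inline int addr_ok_c(uint64_t addr, uint64_t limit)
{
	return addr < limit;
}
```

With the strict comparison kept, an address equal to the limit (0x80_00000000
in the commit message) is still rejected, which matches the objection raised
earlier in the thread.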

Patch

diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index edb3d5c..9309024 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -66,12 +66,12 @@  static inline void set_fs(mm_segment_t fs)
 #define segment_eq(a,b)	((a) == (b))
 
 /*
- * Return 1 if addr < current->addr_limit, 0 otherwise.
+ * Return 1 if addr <= current->addr_limit, 0 otherwise.
  */
 #define __addr_ok(addr)							\
 ({									\
 	unsigned long flag;						\
-	asm("cmp %1, %0; cset %0, lo"					\
+	asm("cmp %1, %0; cset %0, ls"					\
 		: "=&r" (flag)						\
 		: "r" (addr), "0" (current_thread_info()->addr_limit)	\
 		: "cc");						\
@@ -83,7 +83,7 @@  static inline void set_fs(mm_segment_t fs)
  * Returns 1 if the range is valid, 0 otherwise.
  *
  * This is equivalent to the following test:
- * (u65)addr + (u65)size < (u65)current->addr_limit
+ * (u65)addr + (u65)size <= current->addr_limit
  *
  * This needs 65-bit arithmetic.
  */
@@ -91,7 +91,7 @@  static inline void set_fs(mm_segment_t fs)
 ({									\
 	unsigned long flag, roksum;					\
 	__chk_user_ptr(addr);						\
-	asm("adds %1, %1, %3; ccmp %1, %4, #2, cc; cset %0, cc"		\
+	asm("adds %1, %1, %3; ccmp %1, %4, #3, cc; cset %0, ls"		\
 		: "=&r" (flag), "=&r" (roksum)				\
 		: "1" (addr), "Ir" (size),				\
 		  "r" (current_thread_info()->addr_limit)		\