
arm64: Implement 1- and 2-byte smp_load_acquire and smp_store_release

Message ID: 1429544753-4120-1-git-send-email-a.ryabinin@samsung.com
State: New, archived

Commit Message

Andrey Ryabinin April 20, 2015, 3:45 p.m. UTC
Commit 47933ad41a86 ("arch: Introduce smp_load_acquire(), smp_store_release()")
allowed only 4- and 8-byte smp_load_acquire() and smp_store_release(),
so the 1- and 2-byte cases were never implemented in arm64.
Later, commit 536fa402221f ("compiler: Allow 1- and 2-byte smp_load_acquire()
and smp_store_release()") permitted 1- and 2-byte smp_load_acquire() and
smp_store_release() by adjusting the definition of __native_word().
However, the 1- and 2-byte cases in the arm64 version were left unimplemented.

Commit 8053871d0f7f ("smp: Fix smp_call_function_single_async() locking")
started using smp_load_acquire() to load the 2-byte csd->flags,
which crashes the arm64 kernel during boot.

Implement the 1- and 2-byte cases in arm64's smp_load_acquire()
and smp_store_release() to fix this.

Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
---
 arch/arm64/include/asm/barrier.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)
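
For reference, the __native_word() test mentioned above lives in
include/linux/compiler.h; a sketch of it, and of the assertion built on top
of it, as they stand after commit 536fa402221f:

	#define __native_word(t) \
		(sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || \
		 sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))

	#define compiletime_assert_atomic_type(t)			\
		compiletime_assert(__native_word(t),			\
			"Need native word sized stores/loads for atomicity.")

This is why the generic code now lets 1- and 2-byte types through to the
arch macros; without matching case labels in arm64's switch statements,
such accesses compiled to no load or store at all.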

Comments

Will Deacon April 20, 2015, 3:48 p.m. UTC | #1
Hi Andrey,

On Mon, Apr 20, 2015 at 04:45:53PM +0100, Andrey Ryabinin wrote:
> Commit 47933ad41a86 ("arch: Introduce smp_load_acquire(), smp_store_release()")
> allowed only 4- and 8-byte smp_load_acquire() and smp_store_release(),
> so the 1- and 2-byte cases were never implemented in arm64.
> Later, commit 536fa402221f ("compiler: Allow 1- and 2-byte smp_load_acquire()
> and smp_store_release()") permitted 1- and 2-byte smp_load_acquire() and
> smp_store_release() by adjusting the definition of __native_word().
> However, the 1- and 2-byte cases in the arm64 version were left unimplemented.
>
> Commit 8053871d0f7f ("smp: Fix smp_call_function_single_async() locking")
> started using smp_load_acquire() to load the 2-byte csd->flags,
> which crashes the arm64 kernel during boot.
>
> Implement the 1- and 2-byte cases in arm64's smp_load_acquire()
> and smp_store_release() to fix this.
> 
> Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>

I already have an equivalent patch queued in the arm64/fixes branch[1]. I'll
send a pull shortly.

Will

[1] https://git.kernel.org/cgit/linux/kernel/git/arm64/linux.git/log/?h=fixes/core

> ---
>  arch/arm64/include/asm/barrier.h | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
> index a5abb00..71f19c4 100644
> --- a/arch/arm64/include/asm/barrier.h
> +++ b/arch/arm64/include/asm/barrier.h
> @@ -65,6 +65,14 @@ do {									\
>  do {									\
>  	compiletime_assert_atomic_type(*p);				\
>  	switch (sizeof(*p)) {						\
> +	case 1:								\
> +		asm volatile ("stlrb %w1, %0"				\
> +				: "=Q" (*p) : "r" (v) : "memory");	\
> +		break;							\
> +	case 2:								\
> +		asm volatile ("stlrh %w1, %0"				\
> +				: "=Q" (*p) : "r" (v) : "memory");	\
> +		break;							\
>  	case 4:								\
>  		asm volatile ("stlr %w1, %0"				\
>  				: "=Q" (*p) : "r" (v) : "memory");	\
> @@ -81,6 +89,14 @@ do {									\
>  	typeof(*p) ___p1;						\
>  	compiletime_assert_atomic_type(*p);				\
>  	switch (sizeof(*p)) {						\
> +	case 1:								\
> +		asm volatile ("ldarb %w0, %1"				\
> +			: "=r" (___p1) : "Q" (*p) : "memory");		\
> +		break;							\
> +	case 2:								\
> +		asm volatile ("ldarh %w0, %1"				\
> +			: "=r" (___p1) : "Q" (*p) : "memory");		\
> +		break;							\
>  	case 4:								\
>  		asm volatile ("ldar %w0, %1"				\
>  			: "=r" (___p1) : "Q" (*p) : "memory");		\
> -- 
> 2.3.5
>
Linus Torvalds April 20, 2015, 4:08 p.m. UTC | #2
On Mon, Apr 20, 2015 at 8:45 AM, Andrey Ryabinin <a.ryabinin@samsung.com> wrote:
>
> Commit 8053871d0f7f ("smp: Fix smp_call_function_single_async() locking")
> started using smp_load_acquire() to load the 2-byte csd->flags,
> which crashes the arm64 kernel during boot.

Ahh. I don't disagree with your patch, but we should just make
csd->flags be an "int" regardless of this issue.

There is absolutely no reason to try to use 16-bit words. It doesn't
save us any space (alignment), and it's just a *bad* idea to depend on
any kind of atomicity of 16-bit entities. Bytes, yes. Words, yes. But
16-bit half-words? There's no point.

So I'll make it "unsigned int" instead of "u16".

But as said, the ARM64 patch definitely is a good idea regardless.

                          Linus
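
As an illustration of Linus's point, a sketch of the follow-up change he
describes, against the 4.0-era struct call_single_data layout (the layout
here is recalled for illustration, not quoted from the tree):

	 struct call_single_data {
	 	struct llist_node llist;
	 	smp_call_func_t func;
	 	void *info;
	-	u16 flags;
	+	unsigned int flags;
	 };

On a 64-bit kernel the struct's alignment pads flags out to a full word
either way, so the wider type costs nothing.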
Paul E. McKenney April 20, 2015, 4:08 p.m. UTC | #3
On Mon, Apr 20, 2015 at 06:45:53PM +0300, Andrey Ryabinin wrote:
> Commit 47933ad41a86 ("arch: Introduce smp_load_acquire(), smp_store_release()")
> allowed only 4- and 8-byte smp_load_acquire() and smp_store_release(),
> so the 1- and 2-byte cases were never implemented in arm64.
> Later, commit 536fa402221f ("compiler: Allow 1- and 2-byte smp_load_acquire()
> and smp_store_release()") permitted 1- and 2-byte smp_load_acquire() and
> smp_store_release() by adjusting the definition of __native_word().
> However, the 1- and 2-byte cases in the arm64 version were left unimplemented.
>
> Commit 8053871d0f7f ("smp: Fix smp_call_function_single_async() locking")
> started using smp_load_acquire() to load the 2-byte csd->flags,
> which crashes the arm64 kernel during boot.
>
> Implement the 1- and 2-byte cases in arm64's smp_load_acquire()
> and smp_store_release() to fix this.
> 
> Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>

I am introducing a similar smp_load_acquire() case in rcutorture to
replace use of explicit memory barriers, so thank you!  ;-)

Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> ---
>  arch/arm64/include/asm/barrier.h | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
> index a5abb00..71f19c4 100644
> --- a/arch/arm64/include/asm/barrier.h
> +++ b/arch/arm64/include/asm/barrier.h
> @@ -65,6 +65,14 @@ do {									\
>  do {									\
>  	compiletime_assert_atomic_type(*p);				\
>  	switch (sizeof(*p)) {						\
> +	case 1:								\
> +		asm volatile ("stlrb %w1, %0"				\
> +				: "=Q" (*p) : "r" (v) : "memory");	\
> +		break;							\
> +	case 2:								\
> +		asm volatile ("stlrh %w1, %0"				\
> +				: "=Q" (*p) : "r" (v) : "memory");	\
> +		break;							\
>  	case 4:								\
>  		asm volatile ("stlr %w1, %0"				\
>  				: "=Q" (*p) : "r" (v) : "memory");	\
> @@ -81,6 +89,14 @@ do {									\
>  	typeof(*p) ___p1;						\
>  	compiletime_assert_atomic_type(*p);				\
>  	switch (sizeof(*p)) {						\
> +	case 1:								\
> +		asm volatile ("ldarb %w0, %1"				\
> +			: "=r" (___p1) : "Q" (*p) : "memory");		\
> +		break;							\
> +	case 2:								\
> +		asm volatile ("ldarh %w0, %1"				\
> +			: "=r" (___p1) : "Q" (*p) : "memory");		\
> +		break;							\
>  	case 4:								\
>  		asm volatile ("ldar %w0, %1"				\
>  			: "=r" (___p1) : "Q" (*p) : "memory");		\
> -- 
> 2.3.5
>
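
The rcutorture change Paul mentions follows the usual pattern of folding an
explicit barrier into an acquire/release pair; a generic sketch (flag and
data are hypothetical variable names, not rcutorture code):

	/* Writer: was "data = 1; smp_wmb(); WRITE_ONCE(flag, 1);" */
	data = 1;
	smp_store_release(&flag, 1);	/* orders the data store before the flag store */

	/* Reader: was "if (READ_ONCE(flag)) { smp_rmb(); d = data; }" */
	if (smp_load_acquire(&flag))	/* orders the flag load before the data load */
		d = data;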
Paul E. McKenney April 20, 2015, 4:09 p.m. UTC | #4
On Mon, Apr 20, 2015 at 04:48:24PM +0100, Will Deacon wrote:
> Hi Andrey,
> 
> On Mon, Apr 20, 2015 at 04:45:53PM +0100, Andrey Ryabinin wrote:
> > Commit 47933ad41a86 ("arch: Introduce smp_load_acquire(), smp_store_release()")
> > allowed only 4- and 8-byte smp_load_acquire() and smp_store_release(),
> > so the 1- and 2-byte cases were never implemented in arm64.
> > Later, commit 536fa402221f ("compiler: Allow 1- and 2-byte smp_load_acquire()
> > and smp_store_release()") permitted 1- and 2-byte smp_load_acquire() and
> > smp_store_release() by adjusting the definition of __native_word().
> > However, the 1- and 2-byte cases in the arm64 version were left unimplemented.
> >
> > Commit 8053871d0f7f ("smp: Fix smp_call_function_single_async() locking")
> > started using smp_load_acquire() to load the 2-byte csd->flags,
> > which crashes the arm64 kernel during boot.
> >
> > Implement the 1- and 2-byte cases in arm64's smp_load_acquire()
> > and smp_store_release() to fix this.
> > 
> > Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
> 
> I already have an equivalent patch queued in the arm64/fixes branch[1]. I'll
> send a pull shortly.

Even better!  ;-)

							Thanx, Paul

> Will
> 
> [1] https://git.kernel.org/cgit/linux/kernel/git/arm64/linux.git/log/?h=fixes/core
> 
> > ---
> >  arch/arm64/include/asm/barrier.h | 16 ++++++++++++++++
> >  1 file changed, 16 insertions(+)
> > 
> > diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
> > index a5abb00..71f19c4 100644
> > --- a/arch/arm64/include/asm/barrier.h
> > +++ b/arch/arm64/include/asm/barrier.h
> > @@ -65,6 +65,14 @@ do {									\
> >  do {									\
> >  	compiletime_assert_atomic_type(*p);				\
> >  	switch (sizeof(*p)) {						\
> > +	case 1:								\
> > +		asm volatile ("stlrb %w1, %0"				\
> > +				: "=Q" (*p) : "r" (v) : "memory");	\
> > +		break;							\
> > +	case 2:								\
> > +		asm volatile ("stlrh %w1, %0"				\
> > +				: "=Q" (*p) : "r" (v) : "memory");	\
> > +		break;							\
> >  	case 4:								\
> >  		asm volatile ("stlr %w1, %0"				\
> >  				: "=Q" (*p) : "r" (v) : "memory");	\
> > @@ -81,6 +89,14 @@ do {									\
> >  	typeof(*p) ___p1;						\
> >  	compiletime_assert_atomic_type(*p);				\
> >  	switch (sizeof(*p)) {						\
> > +	case 1:								\
> > +		asm volatile ("ldarb %w0, %1"				\
> > +			: "=r" (___p1) : "Q" (*p) : "memory");		\
> > +		break;							\
> > +	case 2:								\
> > +		asm volatile ("ldarh %w0, %1"				\
> > +			: "=r" (___p1) : "Q" (*p) : "memory");		\
> > +		break;							\
> >  	case 4:								\
> >  		asm volatile ("ldar %w0, %1"				\
> >  			: "=r" (___p1) : "Q" (*p) : "memory");		\
> > -- 
> > 2.3.5
> > 
>
Will Deacon April 20, 2015, 4:13 p.m. UTC | #5
On Mon, Apr 20, 2015 at 05:08:37PM +0100, Linus Torvalds wrote:
> On Mon, Apr 20, 2015 at 8:45 AM, Andrey Ryabinin <a.ryabinin@samsung.com> wrote:
> >
> > Commit 8053871d0f7f ("smp: Fix smp_call_function_single_async() locking")
> > started using smp_load_acquire() to load the 2-byte csd->flags,
> > which crashes the arm64 kernel during boot.
> 
> Ahh. I don't disagree with your patch, but we should just make
> csd->flags be an "int" regardless of this issue.
> 
> There is absolutely no reason to try to use 16-bit words. It doesn't
> save us any space (alignment), and it's just a *bad* idea to depend on
> any kind of atomicity of 16-bit entities. Bytes, yes. Words, yes. But
> 16-bit half-words? There's no point.
> 
> So I'll make it "unsigned int" instead of "u16".
> 
> But as said, the ARM64 patch definitely is a good idea regardless.

Sounds good to me and means I can leave the pull for a day or two to see
if any other arm64 fixes queue up.

Will

Patch

diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index a5abb00..71f19c4 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -65,6 +65,14 @@ do {									\
 do {									\
 	compiletime_assert_atomic_type(*p);				\
 	switch (sizeof(*p)) {						\
+	case 1:								\
+		asm volatile ("stlrb %w1, %0"				\
+				: "=Q" (*p) : "r" (v) : "memory");	\
+		break;							\
+	case 2:								\
+		asm volatile ("stlrh %w1, %0"				\
+				: "=Q" (*p) : "r" (v) : "memory");	\
+		break;							\
 	case 4:								\
 		asm volatile ("stlr %w1, %0"				\
 				: "=Q" (*p) : "r" (v) : "memory");	\
@@ -81,6 +89,14 @@ do {									\
 	typeof(*p) ___p1;						\
 	compiletime_assert_atomic_type(*p);				\
 	switch (sizeof(*p)) {						\
+	case 1:								\
+		asm volatile ("ldarb %w0, %1"				\
+			: "=r" (___p1) : "Q" (*p) : "memory");		\
+		break;							\
+	case 2:								\
+		asm volatile ("ldarh %w0, %1"				\
+			: "=r" (___p1) : "Q" (*p) : "memory");		\
+		break;							\
 	case 4:								\
 		asm volatile ("ldar %w0, %1"				\
 			: "=r" (___p1) : "Q" (*p) : "memory");		\