locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs

Message ID 20180505132751.gwzu2vbzibr2risd@gmail.com (mailing list archive)
State New, archived

Commit Message

Ingo Molnar May 5, 2018, 1:27 p.m. UTC
* Boqun Feng <boqun.feng@gmail.com> wrote:

> > May I suggest the patch below? No change in functionality, but it documents the 
> > lack of the cmpxchg_release() APIs and maps them explicitly to the full cmpxchg() 
> > version. (Which the generic code does now in a rather roundabout way.)
> > 
> 
> Hmm.. cmpxchg_release() is actually lwsync() + cmpxchg_relaxed(), but
> you just make it sync() + cmpxchg_relaxed() + sync() with the fallback,
> and sync() is much heavier, so I don't think the fallback is correct.

Indeed!

The bit I missed previously is that PowerPC provides its own __atomic_op_release() 
method:

   #define __atomic_op_release(op, args...)                                \
   ({                                                                      \
           __asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");    \
           op##_relaxed(args);                                             \
   })

... which maps to LWSYNC as you say, and my patch made that worse.
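
To make the cost difference concrete, here is a minimal user-space sketch of the same "barrier first, then the relaxed op" shape, written with C11 atomics rather than the kernel macros. The names sketch_cmpxchg_relaxed() and sketch_cmpxchg_release() are hypothetical and exist only for illustration; the assumption is that a release fence is the portable analogue of PPC_RELEASE_BARRIER (compilers targeting PowerPC typically emit lwsync for it, but that is not verified here).

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Hypothetical helper, not a kernel API: a relaxed compare-and-exchange
 * that returns the value found in *ptr, in the style of the kernel's
 * cmpxchg() family.
 */
static inline int sketch_cmpxchg_relaxed(atomic_int *ptr, int old, int new_val)
{
	int expected = old;

	/* On failure, 'expected' is overwritten with the current value. */
	atomic_compare_exchange_strong_explicit(ptr, &expected, new_val,
						memory_order_relaxed,
						memory_order_relaxed);
	return expected;
}

/*
 * Release variant: one light barrier before the relaxed operation and
 * nothing after it, mirroring the shape of __atomic_op_release().
 */
static inline int sketch_cmpxchg_release(atomic_int *ptr, int old, int new_val)
{
	atomic_thread_fence(memory_order_release);
	return sketch_cmpxchg_relaxed(ptr, old, new_val);
}
```

A caller would use sketch_cmpxchg_release(&v, old, new) exactly like cmpxchg_release(): the return value equals old when the exchange succeeded. The fully ordered fallback would instead need a heavy barrier on both sides of the relaxed operation.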

> I think maybe you can move powerpc's __atomic_op_{acquire,release}()
> from atomic.h to cmpxchg.h (in arch/powerpc/include/asm), and
> 
> 	#define cmpxchg_release __atomic_op_release(cmpxchg, __VA_ARGS__);
> 	#define cmpxchg64_release __atomic_op_release(cmpxchg64, __VA_ARGS__);
> 
> I put a diff below to say what I mean (untested).
> 
> > Also, the change to arch/powerpc/include/asm/atomic.h has no functional effect 
> > right now either, but should anyone add a _relaxed() variant in the future, with 
> > this change atomic_cmpxchg_release() and atomic64_cmpxchg_release() will pick that 
> > up automatically.
> > 
> 
> You mean with your other modification in include/linux/atomic.h, right?
> Because with the unmodified include/linux/atomic.h, we already pick that
> automatically. If so, I think that's fine.
> 
> Here is the diff for the modification for cmpxchg_release(). The idea is
> that we generate them in asm/cmpxchg.h rather than linux/atomic.h for ppc,
> so we keep the new linux/atomic.h working. Because if I understand
> correctly, the next linux/atomic.h only accepts one of the following:
> 
> 1)	architecture only defines fully ordered primitives
> 
> or
> 
> 2)	architecture only defines _relaxed primitives
> 
> or
> 
> 3)	architecture defines all four (fully, _relaxed, _acquire,
> 	_release) primitives
> 
> So powerpc needs to define all four primitives in its own
> asm/cmpxchg.h.

Correct, although the new logic is still RFC, PeterZ didn't like the first version 
I proposed and might NAK them.
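
The three accepted configurations listed in the quoted text above can be pictured as a preprocessor fallback chain. The block below is a simplified sketch of that kind of logic, not the contents of linux/atomic.h: every sketch_* name is hypothetical, the "architecture" provides only a _relaxed primitive (case 2), the operation itself is a plain non-atomic stand-in, and barriers are emulated by a counter so the chosen ordering is observable.

```c
#include <assert.h>

static int sketch_barriers;	/* number of emulated barriers issued so far */

/* "Architecture" part: a hypothetical arch that provides only _relaxed. */
static inline int sketch_arch_cmpxchg_relaxed(int *p, int old, int new_val)
{
	int prev = *p;		/* plain, non-atomic stand-in */

	if (prev == old)
		*p = new_val;
	return prev;
}
#define sketch_cmpxchg_relaxed sketch_arch_cmpxchg_relaxed

/* "Generic" part: fill in whatever the architecture left undefined. */
#ifndef sketch_cmpxchg_relaxed
/* Case 1: only the fully ordered op exists, so every variant aliases it. */
#define sketch_cmpxchg_acquire sketch_cmpxchg
#define sketch_cmpxchg_release sketch_cmpxchg
#define sketch_cmpxchg_relaxed sketch_cmpxchg
#else
/* Case 2: build each missing variant from _relaxed plus barriers. */
#ifndef sketch_cmpxchg_acquire
static inline int sketch_gen_cmpxchg_acquire(int *p, int old, int new_val)
{
	int prev = sketch_cmpxchg_relaxed(p, old, new_val);

	sketch_barriers++;	/* barrier after the op */
	return prev;
}
#define sketch_cmpxchg_acquire sketch_gen_cmpxchg_acquire
#endif

#ifndef sketch_cmpxchg_release
static inline int sketch_gen_cmpxchg_release(int *p, int old, int new_val)
{
	sketch_barriers++;	/* barrier before the op */
	return sketch_cmpxchg_relaxed(p, old, new_val);
}
#define sketch_cmpxchg_release sketch_gen_cmpxchg_release
#endif

#ifndef sketch_cmpxchg
static inline int sketch_gen_cmpxchg(int *p, int old, int new_val)
{
	int prev;

	sketch_barriers++;	/* barrier before the op */
	prev = sketch_cmpxchg_relaxed(p, old, new_val);
	sketch_barriers++;	/* barrier after the op */
	return prev;
}
#define sketch_cmpxchg sketch_gen_cmpxchg
#endif
#endif
```

An architecture in case 3 would define all four sketch_cmpxchg* macros itself before the generic part, so every #ifndef would be skipped and no variant would be generated. That is the situation the PowerPC patch aims for with its own cmpxchg_release().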

Thanks for the patch - I have created the patch below from it and added your 
Signed-off-by.

The only change I made beyond a trivial build fix is that I also added the release 
atomics variants explicitly:

+#define atomic_cmpxchg_release(v, o, n) \
+	cmpxchg_release(&((v)->counter), (o), (n))
+#define atomic64_cmpxchg_release(v, o, n) \
+	cmpxchg_release(&((v)->counter), (o), (n))

It has passed a PowerPC cross-build test here, but no runtime tests.

Does this patch look good to you?

(Still subject to PeterZ's Ack/NAK.)

Thanks,

	Ingo

Comments

Boqun Feng May 5, 2018, 2:03 p.m. UTC | #1
On Sat, May 05, 2018 at 03:27:51PM +0200, Ingo Molnar wrote:
> 
> * Boqun Feng <boqun.feng@gmail.com> wrote:
> 
> > > May I suggest the patch below? No change in functionality, but it documents the 
> > > lack of the cmpxchg_release() APIs and maps them explicitly to the full cmpxchg() 
> > > version. (Which the generic code does now in a rather roundabout way.)
> > > 
> > 
> > Hmm.. cmpxchg_release() is actually lwsync() + cmpxchg_relaxed(), but
> > you just make it sync() + cmpxchg_relaxed() + sync() with the fallback,
> > and sync() is much heavier, so I don't think the fallback is correct.
> 
> Indeed!
> 
> The bit I missed previously is that PowerPC provides its own __atomic_op_release() 
> method:
> 
>    #define __atomic_op_release(op, args...)                                \
>    ({                                                                      \
>            __asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");    \
>            op##_relaxed(args);                                             \
>    })
> 
> ... which maps to LWSYNC as you say, and my patch made that worse.
> 
> > I think maybe you can move powerpc's __atomic_op_{acquire,release}()
> > from atomic.h to cmpxchg.h (in arch/powerpc/include/asm), and
> > 
> > 	#define cmpxchg_release __atomic_op_release(cmpxchg, __VA_ARGS__);
> > 	#define cmpxchg64_release __atomic_op_release(cmpxchg64, __VA_ARGS__);
> > 
> > I put a diff below to say what I mean (untested).
> > 
> > > Also, the change to arch/powerpc/include/asm/atomic.h has no functional effect 
> > > right now either, but should anyone add a _relaxed() variant in the future, with 
> > > this change atomic_cmpxchg_release() and atomic64_cmpxchg_release() will pick that 
> > > up automatically.
> > > 
> > 
> > You mean with your other modification in include/linux/atomic.h, right?
> > Because with the unmodified include/linux/atomic.h, we already pick that
> > automatically. If so, I think that's fine.
> > 
> > Here is the diff for the modification for cmpxchg_release(). The idea is
> > that we generate them in asm/cmpxchg.h rather than linux/atomic.h for ppc,
> > so we keep the new linux/atomic.h working. Because if I understand
> > correctly, the next linux/atomic.h only accepts one of the following:
> > 
> > 1)	architecture only defines fully ordered primitives
> > 
> > or
> > 
> > 2)	architecture only defines _relaxed primitives
> > 
> > or
> > 
> > 3)	architecture defines all four (fully, _relaxed, _acquire,
> > 	_release) primitives
> > 
> > So powerpc needs to define all four primitives in its own
> > asm/cmpxchg.h.
> 
> Correct, although the new logic is still RFC, PeterZ didn't like the first version 
> I proposed and might NAK them.
> 

Understood. From my side, I don't have strong feelings either way.
But since powerpc is affected by the new logic, I'm glad I could
help.

> Thanks for the patch - I have created the patch below from it and added your 
> Signed-off-by.
> 

Thanks ;-)

> The only change I made beyond a trivial build fix is that I also added the release 
> atomics variants explicitly:
> 
> +#define atomic_cmpxchg_release(v, o, n) \
> +	cmpxchg_release(&((v)->counter), (o), (n))
> +#define atomic64_cmpxchg_release(v, o, n) \
> +	cmpxchg_release(&((v)->counter), (o), (n))
> 
> It has passed a PowerPC cross-build test here, but no runtime tests.
> 

Do you have the commit at any branch in tip tree? I could pull it and
cross-build and check the assembly code of lib/atomic64_test.c, that way
I could verify whether we mess something up.

> Does this patch look good to you?
> 

Yep!

Regards,
Boqun

> (Still subject to PeterZ's Ack/NAK.)
> 
> Thanks,
> 
> 	Ingo
> 
> ======================>
> From: Boqun Feng <boqun.feng@gmail.com>
> Date: Sat, 5 May 2018 19:28:17 +0800
> Subject: [PATCH] locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs
> 
> Move PowerPC's __atomic_op_{acquire,release}() from atomic.h to
> cmpxchg.h (in arch/powerpc/include/asm), plus use them to
> define these two methods:
> 
> 	#define cmpxchg_release(...) __atomic_op_release(cmpxchg, __VA_ARGS__)
> 	#define cmpxchg64_release(...) __atomic_op_release(cmpxchg64, __VA_ARGS__)
> 
> ... the idea is to generate all these methods in cmpxchg.h and to define the full
> array of atomic primitives, including the cmpxchg_release() methods which were
> defined by the generic code before.
> 
> Also define the atomic[64]_cmpxchg_release() variants explicitly.
> 
> This ensures that all these low level cmpxchg APIs are defined in
> PowerPC headers, with no generic header fallbacks.
> 
> No change in functionality or code generation.
> 
> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: aryabinin@virtuozzo.com
> Cc: catalin.marinas@arm.com
> Cc: dvyukov@google.com
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: will.deacon@arm.com
> Link: http://lkml.kernel.org/r/20180505112817.ihrb726i37bwm4cj@tardis
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> ---
>  arch/powerpc/include/asm/atomic.h  | 22 ++++------------------
>  arch/powerpc/include/asm/cmpxchg.h | 24 ++++++++++++++++++++++++
>  2 files changed, 28 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
> index 682b3e6a1e21..4e06955ec10f 100644
> --- a/arch/powerpc/include/asm/atomic.h
> +++ b/arch/powerpc/include/asm/atomic.h
> @@ -13,24 +13,6 @@
>  
>  #define ATOMIC_INIT(i)		{ (i) }
>  
> -/*
> - * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with
> - * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
> - * on the platform without lwsync.
> - */
> -#define __atomic_op_acquire(op, args...)				\
> -({									\
> -	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
> -	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");	\
> -	__ret;								\
> -})
> -
> -#define __atomic_op_release(op, args...)				\
> -({									\
> -	__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
> -	op##_relaxed(args);						\
> -})
> -
>  static __inline__ int atomic_read(const atomic_t *v)
>  {
>  	int t;
> @@ -213,6 +195,8 @@ static __inline__ int atomic_dec_return_relaxed(atomic_t *v)
>  	cmpxchg_relaxed(&((v)->counter), (o), (n))
>  #define atomic_cmpxchg_acquire(v, o, n) \
>  	cmpxchg_acquire(&((v)->counter), (o), (n))
> +#define atomic_cmpxchg_release(v, o, n) \
> +	cmpxchg_release(&((v)->counter), (o), (n))
>  
>  #define atomic_xchg(v, new) (xchg(&((v)->counter), new))
>  #define atomic_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
> @@ -519,6 +503,8 @@ static __inline__ long atomic64_dec_if_positive(atomic64_t *v)
>  	cmpxchg_relaxed(&((v)->counter), (o), (n))
>  #define atomic64_cmpxchg_acquire(v, o, n) \
>  	cmpxchg_acquire(&((v)->counter), (o), (n))
> +#define atomic64_cmpxchg_release(v, o, n) \
> +	cmpxchg_release(&((v)->counter), (o), (n))
>  
>  #define atomic64_xchg(v, new) (xchg(&((v)->counter), new))
>  #define atomic64_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
> diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h
> index 9b001f1f6b32..e27a612b957f 100644
> --- a/arch/powerpc/include/asm/cmpxchg.h
> +++ b/arch/powerpc/include/asm/cmpxchg.h
> @@ -8,6 +8,24 @@
>  #include <asm/asm-compat.h>
>  #include <linux/bug.h>
>  
> +/*
> + * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with
> + * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
> + * on the platform without lwsync.
> + */
> +#define __atomic_op_acquire(op, args...)				\
> +({									\
> +	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
> +	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");	\
> +	__ret;								\
> +})
> +
> +#define __atomic_op_release(op, args...)				\
> +({									\
> +	__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
> +	op##_relaxed(args);						\
> +})
> +
>  #ifdef __BIG_ENDIAN
>  #define BITOFF_CAL(size, off)	((sizeof(u32) - size - off) * BITS_PER_BYTE)
>  #else
> @@ -512,6 +530,9 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
>  			(unsigned long)_o_, (unsigned long)_n_,		\
>  			sizeof(*(ptr)));				\
>  })
> +
> +#define cmpxchg_release(...) __atomic_op_release(cmpxchg, __VA_ARGS__)
> +
>  #ifdef CONFIG_PPC64
>  #define cmpxchg64(ptr, o, n)						\
>    ({									\
> @@ -533,6 +554,9 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
>  	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
>  	cmpxchg_acquire((ptr), (o), (n));				\
>  })
> +
> +#define cmpxchg64_release(...) __atomic_op_release(cmpxchg64, __VA_ARGS__)
> +
>  #else
>  #include <asm-generic/cmpxchg-local.h>
>  #define cmpxchg64_local(ptr, o, n) __cmpxchg64_local_generic((ptr), (o), (n))
Ingo Molnar May 6, 2018, 12:11 p.m. UTC | #2
* Boqun Feng <boqun.feng@gmail.com> wrote:

> > The only change I made beyond a trivial build fix is that I also added the release 
> > atomics variants explicitly:
> > 
> > +#define atomic_cmpxchg_release(v, o, n) \
> > +	cmpxchg_release(&((v)->counter), (o), (n))
> > +#define atomic64_cmpxchg_release(v, o, n) \
> > +	cmpxchg_release(&((v)->counter), (o), (n))
> > 
> > It has passed a PowerPC cross-build test here, but no runtime tests.
> > 
> 
> Do you have the commit at any branch in tip tree? I could pull it and
> cross-build and check the assembly code of lib/atomic64_test.c, that way
> I could verify whether we mess something up.
> 
> > Does this patch look good to you?
> > 
> 
> Yep!

Great - I have pushed the commits out into the locking tree, they can be found in:

  git fetch git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git locking/core

The PowerPC preparatory commit from you is:

  0476a632cb3a: locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs

Thanks,

	Ingo
Boqun Feng May 7, 2018, 1:04 a.m. UTC | #3
On Sun, May 6, 2018, at 8:11 PM, Ingo Molnar wrote:
> 
> * Boqun Feng <boqun.feng@gmail.com> wrote:
> 
> > > The only change I made beyond a trivial build fix is that I also added the release 
> > > atomics variants explicitly:
> > > 
> > > +#define atomic_cmpxchg_release(v, o, n) \
> > > +	cmpxchg_release(&((v)->counter), (o), (n))
> > > +#define atomic64_cmpxchg_release(v, o, n) \
> > > +	cmpxchg_release(&((v)->counter), (o), (n))
> > > 
> > > It has passed a PowerPC cross-build test here, but no runtime tests.
> > > 
> > 
> > Do you have the commit at any branch in tip tree? I could pull it and
> > cross-build and check the assembly code of lib/atomic64_test.c, that way
> > I could verify whether we mess something up.
> > 
> > > Does this patch look good to you?
> > > 
> > 
> > Yep!
> 
> Great - I have pushed the commits out into the locking tree, they can be 
> found in:
> 
>   git fetch git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 
> locking/core
> 

Thanks! My compile test told me that we need to remove the definitions of
atomic_xchg and atomic64_xchg in ppc's asm/atomic.h: they are now
duplicates, and will prevent the generation of _release and _acquire in the
new logic.

If you need an updated patch for this from me, I could send it later today.
(I don't have a handy environment for patch sending now, so...)

Other than this, the modification looks fine, the lib/atomic64_test.c
generated the same asm before and after the patches.

Regards,
Boqun

> The PowerPC preparatory commit from you is:
> 
>   0476a632cb3a: locking/atomics/powerpc: Move cmpxchg helpers to asm/
> cmpxchg.h and define the full set of cmpxchg APIs
> 
> Thanks,
> 
> 	Ingo
Ingo Molnar May 7, 2018, 6:50 a.m. UTC | #4
* Boqun Feng <boqun.feng@gmail.com> wrote:

> 
> 
> On Sun, May 6, 2018, at 8:11 PM, Ingo Molnar wrote:
> > 
> > * Boqun Feng <boqun.feng@gmail.com> wrote:
> > 
> > > > The only change I made beyond a trivial build fix is that I also added the release 
> > > > atomics variants explicitly:
> > > > 
> > > > +#define atomic_cmpxchg_release(v, o, n) \
> > > > +	cmpxchg_release(&((v)->counter), (o), (n))
> > > > +#define atomic64_cmpxchg_release(v, o, n) \
> > > > +	cmpxchg_release(&((v)->counter), (o), (n))
> > > > 
> > > > It has passed a PowerPC cross-build test here, but no runtime tests.
> > > > 
> > > 
> > > Do you have the commit at any branch in tip tree? I could pull it and
> > > cross-build and check the assembly code of lib/atomic64_test.c, that way
> > > I could verify whether we mess something up.
> > > 
> > > > Does this patch look good to you?
> > > > 
> > > 
> > > Yep!
> > 
> > Great - I have pushed the commits out into the locking tree, they can be 
> > found in:
> > 
> >   git fetch git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 
> > locking/core
> > 
> 
> Thanks! My compile test told me that we need to remove the definitions of
> atomic_xchg and atomic64_xchg in ppc's asm/atomic.h: they are now
> duplicates, and will prevent the generation of _release and _acquire in the
> new logic.
> 
> If you need an updated patch for this from me, I could send it later today.
> (I don't have a handy environment for patch sending now, so...)

That would be cool, thanks! My own cross-build testing didn't trigger that build 
failure.

> Other than this, the modification looks fine, the lib/atomic64_test.c
> generated the same asm before and after the patches.

Cool, thanks for checking!

Thanks,

	Ingo
Patch

======================>
From: Boqun Feng <boqun.feng@gmail.com>
Date: Sat, 5 May 2018 19:28:17 +0800
Subject: [PATCH] locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs

Move PowerPC's __atomic_op_{acquire,release}() from atomic.h to
cmpxchg.h (in arch/powerpc/include/asm), plus use them to
define these two methods:

	#define cmpxchg_release(...) __atomic_op_release(cmpxchg, __VA_ARGS__)
	#define cmpxchg64_release(...) __atomic_op_release(cmpxchg64, __VA_ARGS__)

... the idea is to generate all these methods in cmpxchg.h and to define the full
array of atomic primitives, including the cmpxchg_release() methods which were
defined by the generic code before.

Also define the atomic[64]_cmpxchg_release() variants explicitly.

This ensures that all these low level cmpxchg APIs are defined in
PowerPC headers, with no generic header fallbacks.

No change in functionality or code generation.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: aryabinin@virtuozzo.com
Cc: catalin.marinas@arm.com
Cc: dvyukov@google.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/20180505112817.ihrb726i37bwm4cj@tardis
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/powerpc/include/asm/atomic.h  | 22 ++++------------------
 arch/powerpc/include/asm/cmpxchg.h | 24 ++++++++++++++++++++++++
 2 files changed, 28 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
index 682b3e6a1e21..4e06955ec10f 100644
--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -13,24 +13,6 @@ 
 
 #define ATOMIC_INIT(i)		{ (i) }
 
-/*
- * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with
- * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
- * on the platform without lwsync.
- */
-#define __atomic_op_acquire(op, args...)				\
-({									\
-	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
-	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");	\
-	__ret;								\
-})
-
-#define __atomic_op_release(op, args...)				\
-({									\
-	__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
-	op##_relaxed(args);						\
-})
-
 static __inline__ int atomic_read(const atomic_t *v)
 {
 	int t;
@@ -213,6 +195,8 @@  static __inline__ int atomic_dec_return_relaxed(atomic_t *v)
 	cmpxchg_relaxed(&((v)->counter), (o), (n))
 #define atomic_cmpxchg_acquire(v, o, n) \
 	cmpxchg_acquire(&((v)->counter), (o), (n))
+#define atomic_cmpxchg_release(v, o, n) \
+	cmpxchg_release(&((v)->counter), (o), (n))
 
 #define atomic_xchg(v, new) (xchg(&((v)->counter), new))
 #define atomic_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
@@ -519,6 +503,8 @@  static __inline__ long atomic64_dec_if_positive(atomic64_t *v)
 	cmpxchg_relaxed(&((v)->counter), (o), (n))
 #define atomic64_cmpxchg_acquire(v, o, n) \
 	cmpxchg_acquire(&((v)->counter), (o), (n))
+#define atomic64_cmpxchg_release(v, o, n) \
+	cmpxchg_release(&((v)->counter), (o), (n))
 
 #define atomic64_xchg(v, new) (xchg(&((v)->counter), new))
 #define atomic64_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h
index 9b001f1f6b32..e27a612b957f 100644
--- a/arch/powerpc/include/asm/cmpxchg.h
+++ b/arch/powerpc/include/asm/cmpxchg.h
@@ -8,6 +8,24 @@ 
 #include <asm/asm-compat.h>
 #include <linux/bug.h>
 
+/*
+ * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with
+ * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
+ * on the platform without lwsync.
+ */
+#define __atomic_op_acquire(op, args...)				\
+({									\
+	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
+	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");	\
+	__ret;								\
+})
+
+#define __atomic_op_release(op, args...)				\
+({									\
+	__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
+	op##_relaxed(args);						\
+})
+
 #ifdef __BIG_ENDIAN
 #define BITOFF_CAL(size, off)	((sizeof(u32) - size - off) * BITS_PER_BYTE)
 #else
@@ -512,6 +530,9 @@  __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
 			(unsigned long)_o_, (unsigned long)_n_,		\
 			sizeof(*(ptr)));				\
 })
+
+#define cmpxchg_release(...) __atomic_op_release(cmpxchg, __VA_ARGS__)
+
 #ifdef CONFIG_PPC64
 #define cmpxchg64(ptr, o, n)						\
   ({									\
@@ -533,6 +554,9 @@  __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
 	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
 	cmpxchg_acquire((ptr), (o), (n));				\
 })
+
+#define cmpxchg64_release(...) __atomic_op_release(cmpxchg64, __VA_ARGS__)
+
 #else
 #include <asm-generic/cmpxchg-local.h>
 #define cmpxchg64_local(ptr, o, n) __cmpxchg64_local_generic((ptr), (o), (n))