| Message ID | 20211210151410.2782645-4-mark.rutland@arm.com (mailing list archive) |
|---|---|
| State | New, archived |
| Series | arm64: atomics: cleanups and codegen improvements |
On Fri, Dec 10, 2021 at 03:14:08PM +0000, Mark Rutland wrote:
> The FEAT_LSE atomic instructions include atomic bit-clear instructions
> (`ldclr*` and `stclr*`) which can be used to directly implement ANDNOT
> operations. Each AND op is implemented as a copy of the corresponding
> ANDNOT op with a leading `mvn` instruction to apply a bitwise NOT to
> the `i` argument.
>
> As the compiler has no visibility of the `mvn`, this leads to
> sub-optimal code generation when materializing `i` in a register. For
> example, __lse_atomic_fetch_and(0xf, v) can be compiled to:
>
> 	mov	w1, #0xf
> 	mvn	w1, w1
> 	ldclral	w1, w1, [x2]
>
> This patch addresses that by replacing the `mvn` with a bitwise NOT in
> C before the inline assembly block, e.g.
>
> 	i = ~i;
>
> This allows the compiler to generate `i` into a register more
> optimally, e.g.
>
> 	mov	w1, #0xfffffff0
> 	ldclral	w1, w1, [x2]
>
> With this change the assembly for each AND op is identical to the
> corresponding ANDNOT op (including barriers and clobbers), so I've
> removed the inline assembly and rewritten each AND op in terms of the
> corresponding ANDNOT op, e.g.
>
> | static inline void __lse_atomic_and(int i, atomic_t *v)
> | {
> | 	return __lse_atomic_andnot(~i, v);
> | }
>
> This is intended as an optimization and cleanup. There should be no
> functional change as a result of this patch.
>
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/include/asm/atomic_lse.h | 34 ++++-------------------------
>  1 file changed, 4 insertions(+), 30 deletions(-)

Acked-by: Will Deacon <will@kernel.org>

Will
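The rewrite leans on the identity `x & i == x & ~(~i)`: ANDing with a mask is the same as atomically clearing the complement of that mask. A minimal stand-alone sketch of that identity using C11 atomics (user-space code with hypothetical helper names, not part of the patch):

	/*
	 * fetch_andnot() stands in for the ldclr-based LSE op; the helpers
	 * mirror the kernel's (i, v) argument order but are otherwise
	 * hypothetical.
	 */
	#include <assert.h>
	#include <stdatomic.h>

	static int fetch_andnot(int i, atomic_int *v)
	{
		/* atomically clear the bits set in i, as ldclr does */
		return atomic_fetch_and(v, ~i);
	}

	static int fetch_and(int i, atomic_int *v)
	{
		/* the patch's rewrite: AND delegates to ANDNOT of the complement */
		return fetch_andnot(~i, v);
	}

	int main(void)
	{
		atomic_int v = 0x5a;
		int old = fetch_and(0xf, &v);

		assert(old == 0x5a);			/* fetch ops return the old value */
		assert(atomic_load(&v) == 0x0a);	/* 0x5a & 0xf */
		return 0;
	}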
diff --git a/arch/arm64/include/asm/atomic_lse.h b/arch/arm64/include/asm/atomic_lse.h
index 7454febb6d77..d707eafb7677 100644
--- a/arch/arm64/include/asm/atomic_lse.h
+++ b/arch/arm64/include/asm/atomic_lse.h
@@ -102,26 +102,13 @@ ATOMIC_OP_ADD_SUB_RETURN(        , al, "memory")
 
 static inline void __lse_atomic_and(int i, atomic_t *v)
 {
-	asm volatile(
-	__LSE_PREAMBLE
-	"	mvn	%w[i], %w[i]\n"
-	"	stclr	%w[i], %[v]"
-	: [i] "+&r" (i), [v] "+Q" (v->counter)
-	: "r" (v));
+	return __lse_atomic_andnot(~i, v);
 }
 
 #define ATOMIC_FETCH_OP_AND(name, mb, cl...)				\
 static inline int __lse_atomic_fetch_and##name(int i, atomic_t *v)	\
 {									\
-	asm volatile(							\
-	__LSE_PREAMBLE							\
-	"	mvn	%w[i], %w[i]\n"					\
-	"	ldclr" #mb "	%w[i], %w[i], %[v]"			\
-	: [i] "+&r" (i), [v] "+Q" (v->counter)				\
-	: "r" (v)							\
-	: cl);								\
-									\
-	return i;							\
+	return __lse_atomic_fetch_andnot##name(~i, v);			\
 }
 
 ATOMIC_FETCH_OP_AND(_relaxed,   )
@@ -223,26 +210,13 @@ ATOMIC64_OP_ADD_SUB_RETURN(        , al, "memory")
 
 static inline void __lse_atomic64_and(s64 i, atomic64_t *v)
 {
-	asm volatile(
-	__LSE_PREAMBLE
-	"	mvn	%[i], %[i]\n"
-	"	stclr	%[i], %[v]"
-	: [i] "+&r" (i), [v] "+Q" (v->counter)
-	: "r" (v));
+	return __lse_atomic64_andnot(~i, v);
 }
 
 #define ATOMIC64_FETCH_OP_AND(name, mb, cl...)				\
 static inline long __lse_atomic64_fetch_and##name(s64 i, atomic64_t *v)	\
 {									\
-	asm volatile(							\
-	__LSE_PREAMBLE							\
-	"	mvn	%[i], %[i]\n"					\
-	"	ldclr" #mb "	%[i], %[i], %[v]"			\
-	: [i] "+&r" (i), [v] "+Q" (v->counter)				\
-	: "r" (v)							\
-	: cl);								\
-									\
-	return i;							\
+	return __lse_atomic64_fetch_andnot##name(~i, v);		\
 }
 
 ATOMIC64_FETCH_OP_AND(_relaxed,   )
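For context (not shown in this diff), the ANDNOT op that the rewritten AND now delegates to lives earlier in the same header. Per the commit message the old AND op was a copy of it with a leading `mvn`, so it should look roughly like the following (a reconstruction, not code from the patch):

	static inline void __lse_atomic_andnot(int i, atomic_t *v)
	{
		asm volatile(
		__LSE_PREAMBLE
		"	stclr	%w[i], %[v]"
		: [i] "+&r" (i), [v] "+Q" (v->counter)
		: "r" (v));
	}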
The FEAT_LSE atomic instructions include atomic bit-clear instructions
(`ldclr*` and `stclr*`) which can be used to directly implement ANDNOT
operations. Each AND op is implemented as a copy of the corresponding
ANDNOT op with a leading `mvn` instruction to apply a bitwise NOT to
the `i` argument.

As the compiler has no visibility of the `mvn`, this leads to
sub-optimal code generation when materializing `i` in a register. For
example, __lse_atomic_fetch_and(0xf, v) can be compiled to:

	mov	w1, #0xf
	mvn	w1, w1
	ldclral	w1, w1, [x2]

This patch addresses that by replacing the `mvn` with a bitwise NOT in
C before the inline assembly block, e.g.

	i = ~i;

This allows the compiler to generate `i` into a register more
optimally, e.g.

	mov	w1, #0xfffffff0
	ldclral	w1, w1, [x2]

With this change the assembly for each AND op is identical to the
corresponding ANDNOT op (including barriers and clobbers), so I've
removed the inline assembly and rewritten each AND op in terms of the
corresponding ANDNOT op, e.g.

| static inline void __lse_atomic_and(int i, atomic_t *v)
| {
| 	return __lse_atomic_andnot(~i, v);
| }

This is intended as an optimization and cleanup. There should be no
functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/atomic_lse.h | 34 ++++-------------------------
 1 file changed, 4 insertions(+), 30 deletions(-)
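Written out by hand, the `ATOMIC_FETCH_OP_AND(_relaxed,   )` instantiation visible in the diff now expands to a plain delegating function (an illustration of the macro, not code from the patch):

	static inline int __lse_atomic_fetch_and_relaxed(int i, atomic_t *v)
	{
		return __lse_atomic_fetch_andnot_relaxed(~i, v);
	}

The now-unused `mb` and `cl` macro parameters are kept in place so the existing instantiation sites do not need to change.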