From patchwork Fri Dec 10 15:14:06 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Rutland X-Patchwork-Id: 12695611 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A3FEDC433EF for ; Fri, 10 Dec 2021 15:16:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=cc2nB3ocf9AZ5w2e79HSyJ92eOMEsttmz3pG9/D23e8=; b=zIfgXt9Oyc6qJX 0z6M+03n9tzbsU197QIuh2H70Aw3C06ZyXADruVb8rlj21D5EyUxumfICxDXq9zNssQqvK6HNKnli SVhoiWvPFcUpxMICQTBFRTSzwhOv2hdcVISiDHFQ5gRAU6krm1v4xoaPH8rzabFN6G+M5QxZO4ebw 3RqoRjDASxcDkGFeeaNBek82k9XLQDT7S9pAmSjhfjxTbirvC0+OqvJn2Ea/5NHNVyO+yah9mZBWw ZDOakHFEIw7CEqGKBbCtFLyXbZ3XXB/JLP0iTAAwpbE5e23Kqm56Ajr9soxtWU3DWDTFtHpEb2RVY jHU+c42yrs7QIEJziE1Q==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1mvhbn-002LQ3-Ql; Fri, 10 Dec 2021 15:14:51 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1mvhbI-002LFj-JD for linux-arm-kernel@lists.infradead.org; Fri, 10 Dec 2021 15:14:24 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 3C2C412FC; Fri, 10 Dec 2021 07:14:18 -0800 (PST) Received: from lakrids.cambridge.arm.com (usa-sjc-imap-foss1.foss.arm.com [10.121.207.14]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 570133F5A1; Fri, 10 Dec 2021 07:14:17 -0800 (PST) From: Mark Rutland To: linux-arm-kernel@lists.infradead.org Cc: boqun.feng@gmail.com, catalin.marinas@arm.com, mark.rutland@arm.com, peterz@infradead.org, will@kernel.org Subject: [PATCH 1/5] arm64: atomics: format whitespace consistently Date: Fri, 10 Dec 2021 15:14:06 +0000 Message-Id: <20211210151410.2782645-2-mark.rutland@arm.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20211210151410.2782645-1-mark.rutland@arm.com> References: <20211210151410.2782645-1-mark.rutland@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20211210_071420_755311_6271EC75 X-CRM114-Status: UNSURE ( 9.42 ) X-CRM114-Notice: Please train this message. X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org The code for the atomic ops is formatted inconsistently, and while this is not a functional problem it is rather distracting when working on them. Some have ops have consistent indentation, e.g. | #define ATOMIC_OP_ADD_RETURN(name, mb, cl...) \ | static inline int __lse_atomic_add_return##name(int i, atomic_t *v) \ | { \ | u32 tmp; \ | \ | asm volatile( \ | __LSE_PREAMBLE \ | " ldadd" #mb " %w[i], %w[tmp], %[v]\n" \ | " add %w[i], %w[i], %w[tmp]" \ | : [i] "+r" (i), [v] "+Q" (v->counter), [tmp] "=&r" (tmp) \ | : "r" (v) \ | : cl); \ | \ | return i; \ | } While others have negative indentation for some lines, and/or have misaligned trailing backslashes, e.g. | static inline void __lse_atomic_##op(int i, atomic_t *v) \ | { \ | asm volatile( \ | __LSE_PREAMBLE \ | " " #asm_op " %w[i], %[v]\n" \ | : [i] "+r" (i), [v] "+Q" (v->counter) \ | : "r" (v)); \ | } This patch makes the indentation consistent and also aligns the trailing backslashes. This makes the code easier to read for those (like myself) who are easily distracted by these inconsistencies. This is intended as a cleanup. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland Cc: Boqun Feng Cc: Catalin Marinas Cc: Peter Zijlstra Cc: Will Deacon Acked-by: Will Deacon --- arch/arm64/include/asm/atomic_ll_sc.h | 86 +++++++++++++-------------- arch/arm64/include/asm/atomic_lse.h | 14 ++--- 2 files changed, 50 insertions(+), 50 deletions(-) diff --git a/arch/arm64/include/asm/atomic_ll_sc.h b/arch/arm64/include/asm/atomic_ll_sc.h index 13869b76b58c..fe0db8d416fb 100644 --- a/arch/arm64/include/asm/atomic_ll_sc.h +++ b/arch/arm64/include/asm/atomic_ll_sc.h @@ -44,11 +44,11 @@ __ll_sc_atomic_##op(int i, atomic_t *v) \ \ asm volatile("// atomic_" #op "\n" \ __LL_SC_FALLBACK( \ -" prfm pstl1strm, %2\n" \ -"1: ldxr %w0, %2\n" \ -" " #asm_op " %w0, %w0, %w3\n" \ -" stxr %w1, %w0, %2\n" \ -" cbnz %w1, 1b\n") \ + " prfm pstl1strm, %2\n" \ + "1: ldxr %w0, %2\n" \ + " " #asm_op " %w0, %w0, %w3\n" \ + " stxr %w1, %w0, %2\n" \ + " cbnz %w1, 1b\n") \ : "=&r" (result), "=&r" (tmp), "+Q" (v->counter) \ : __stringify(constraint) "r" (i)); \ } @@ -62,12 +62,12 @@ __ll_sc_atomic_##op##_return##name(int i, atomic_t *v) \ \ asm volatile("// atomic_" #op "_return" #name "\n" \ __LL_SC_FALLBACK( \ -" prfm pstl1strm, %2\n" \ -"1: ld" #acq "xr %w0, %2\n" \ -" " #asm_op " %w0, %w0, %w3\n" \ -" st" #rel "xr %w1, %w0, %2\n" \ -" cbnz %w1, 1b\n" \ -" " #mb ) \ + " prfm pstl1strm, %2\n" \ + "1: ld" #acq "xr %w0, %2\n" \ + " " #asm_op " %w0, %w0, %w3\n" \ + " st" #rel "xr %w1, %w0, %2\n" \ + " cbnz %w1, 1b\n" \ + " " #mb ) \ : "=&r" (result), "=&r" (tmp), "+Q" (v->counter) \ : __stringify(constraint) "r" (i) \ : cl); \ @@ -84,12 +84,12 @@ __ll_sc_atomic_fetch_##op##name(int i, atomic_t *v) \ \ asm volatile("// atomic_fetch_" #op #name "\n" \ __LL_SC_FALLBACK( \ -" prfm pstl1strm, %3\n" \ -"1: ld" #acq "xr %w0, %3\n" \ -" " #asm_op " %w1, %w0, %w4\n" \ -" st" #rel "xr %w2, %w1, %3\n" \ -" cbnz %w2, 1b\n" \ -" " #mb ) \ + " prfm pstl1strm, %3\n" \ + "1: ld" #acq "xr %w0, %3\n" \ + " " #asm_op " %w1, %w0, %w4\n" \ + " st" #rel "xr %w2, %w1, %3\n" \ + " cbnz %w2, 1b\n" \ + " " #mb ) \ : "=&r" (result), "=&r" (val), "=&r" (tmp), "+Q" (v->counter) \ : __stringify(constraint) "r" (i) \ : cl); \ @@ -143,11 +143,11 @@ __ll_sc_atomic64_##op(s64 i, atomic64_t *v) \ \ asm volatile("// atomic64_" #op "\n" \ __LL_SC_FALLBACK( \ -" prfm pstl1strm, %2\n" \ -"1: ldxr %0, %2\n" \ -" " #asm_op " %0, %0, %3\n" \ -" stxr %w1, %0, %2\n" \ -" cbnz %w1, 1b") \ + " prfm pstl1strm, %2\n" \ + "1: ldxr %0, %2\n" \ + " " #asm_op " %0, %0, %3\n" \ + " stxr %w1, %0, %2\n" \ + " cbnz %w1, 1b") \ : "=&r" (result), "=&r" (tmp), "+Q" (v->counter) \ : __stringify(constraint) "r" (i)); \ } @@ -161,12 +161,12 @@ __ll_sc_atomic64_##op##_return##name(s64 i, atomic64_t *v) \ \ asm volatile("// atomic64_" #op "_return" #name "\n" \ __LL_SC_FALLBACK( \ -" prfm pstl1strm, %2\n" \ -"1: ld" #acq "xr %0, %2\n" \ -" " #asm_op " %0, %0, %3\n" \ -" st" #rel "xr %w1, %0, %2\n" \ -" cbnz %w1, 1b\n" \ -" " #mb ) \ + " prfm pstl1strm, %2\n" \ + "1: ld" #acq "xr %0, %2\n" \ + " " #asm_op " %0, %0, %3\n" \ + " st" #rel "xr %w1, %0, %2\n" \ + " cbnz %w1, 1b\n" \ + " " #mb ) \ : "=&r" (result), "=&r" (tmp), "+Q" (v->counter) \ : __stringify(constraint) "r" (i) \ : cl); \ @@ -176,19 +176,19 @@ __ll_sc_atomic64_##op##_return##name(s64 i, atomic64_t *v) \ #define ATOMIC64_FETCH_OP(name, mb, acq, rel, cl, op, asm_op, constraint)\ static inline long \ -__ll_sc_atomic64_fetch_##op##name(s64 i, atomic64_t *v) \ +__ll_sc_atomic64_fetch_##op##name(s64 i, atomic64_t *v) \ { \ s64 result, val; \ unsigned long tmp; \ \ asm volatile("// atomic64_fetch_" #op #name "\n" \ __LL_SC_FALLBACK( \ -" prfm pstl1strm, %3\n" \ -"1: ld" #acq "xr %0, %3\n" \ -" " #asm_op " %1, %0, %4\n" \ -" st" #rel "xr %w2, %1, %3\n" \ -" cbnz %w2, 1b\n" \ -" " #mb ) \ + " prfm pstl1strm, %3\n" \ + "1: ld" #acq "xr %0, %3\n" \ + " " #asm_op " %1, %0, %4\n" \ + " st" #rel "xr %w2, %1, %3\n" \ + " cbnz %w2, 1b\n" \ + " " #mb ) \ : "=&r" (result), "=&r" (val), "=&r" (tmp), "+Q" (v->counter) \ : __stringify(constraint) "r" (i) \ : cl); \ @@ -241,14 +241,14 @@ __ll_sc_atomic64_dec_if_positive(atomic64_t *v) asm volatile("// atomic64_dec_if_positive\n" __LL_SC_FALLBACK( -" prfm pstl1strm, %2\n" -"1: ldxr %0, %2\n" -" subs %0, %0, #1\n" -" b.lt 2f\n" -" stlxr %w1, %0, %2\n" -" cbnz %w1, 1b\n" -" dmb ish\n" -"2:") + " prfm pstl1strm, %2\n" + "1: ldxr %0, %2\n" + " subs %0, %0, #1\n" + " b.lt 2f\n" + " stlxr %w1, %0, %2\n" + " cbnz %w1, 1b\n" + " dmb ish\n" + "2:") : "=&r" (result), "=&r" (tmp), "+Q" (v->counter) : : "cc", "memory"); diff --git a/arch/arm64/include/asm/atomic_lse.h b/arch/arm64/include/asm/atomic_lse.h index da3280f639cd..ab661375835e 100644 --- a/arch/arm64/include/asm/atomic_lse.h +++ b/arch/arm64/include/asm/atomic_lse.h @@ -11,11 +11,11 @@ #define __ASM_ATOMIC_LSE_H #define ATOMIC_OP(op, asm_op) \ -static inline void __lse_atomic_##op(int i, atomic_t *v) \ +static inline void __lse_atomic_##op(int i, atomic_t *v) \ { \ asm volatile( \ __LSE_PREAMBLE \ -" " #asm_op " %w[i], %[v]\n" \ + " " #asm_op " %w[i], %[v]\n" \ : [i] "+r" (i), [v] "+Q" (v->counter) \ : "r" (v)); \ } @@ -32,7 +32,7 @@ static inline int __lse_atomic_fetch_##op##name(int i, atomic_t *v) \ { \ asm volatile( \ __LSE_PREAMBLE \ -" " #asm_op #mb " %w[i], %w[i], %[v]" \ + " " #asm_op #mb " %w[i], %w[i], %[v]" \ : [i] "+r" (i), [v] "+Q" (v->counter) \ : "r" (v) \ : cl); \ @@ -130,7 +130,7 @@ static inline int __lse_atomic_sub_return##name(int i, atomic_t *v) \ " add %w[i], %w[i], %w[tmp]" \ : [i] "+&r" (i), [v] "+Q" (v->counter), [tmp] "=&r" (tmp) \ : "r" (v) \ - : cl); \ + : cl); \ \ return i; \ } @@ -168,7 +168,7 @@ static inline void __lse_atomic64_##op(s64 i, atomic64_t *v) \ { \ asm volatile( \ __LSE_PREAMBLE \ -" " #asm_op " %[i], %[v]\n" \ + " " #asm_op " %[i], %[v]\n" \ : [i] "+r" (i), [v] "+Q" (v->counter) \ : "r" (v)); \ } @@ -185,7 +185,7 @@ static inline long __lse_atomic64_fetch_##op##name(s64 i, atomic64_t *v)\ { \ asm volatile( \ __LSE_PREAMBLE \ -" " #asm_op #mb " %[i], %[i], %[v]" \ + " " #asm_op #mb " %[i], %[i], %[v]" \ : [i] "+r" (i), [v] "+Q" (v->counter) \ : "r" (v) \ : cl); \ @@ -272,7 +272,7 @@ static inline void __lse_atomic64_sub(s64 i, atomic64_t *v) } #define ATOMIC64_OP_SUB_RETURN(name, mb, cl...) \ -static inline long __lse_atomic64_sub_return##name(s64 i, atomic64_t *v) \ +static inline long __lse_atomic64_sub_return##name(s64 i, atomic64_t *v)\ { \ unsigned long tmp; \ \ From patchwork Fri Dec 10 15:14:07 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Rutland X-Patchwork-Id: 12695610 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 429C7C433EF for ; Fri, 10 Dec 2021 15:16:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=QkdFiCrB94ReJE51dH3nnoO9kEcJmZR/lPw69HpVAC0=; b=eZUJgVTg2mpV79 CF06C6MyKfKK7fPB3Qh6/h4X9s/+xefa01llO6lLDGyZTtyaVrSLiE0MY0mrX5sZfUeIGsuNp31sQ I/+Kl0mSBqWEuNyfxnwSux6PL6UKl3y81djO4kt2r9gOXaxWocuQYC80bcFh8zLOXbUYTYdk3Dbgl 0+jI15N+WhRvZ/dZ5fIBK3Hd9shhX85/+QZJYYKeOEFOllIuOaquVv8e8SsteczTxcEeCY2Nz2j2n 9pGjFE+oIFk+MwS9wj5L/34uZZfuU6eniUrKciFimjk60aez0etFiB26LWHKb/OWUGHFS6CIZvT2q kFt7DM1a+dnX/x6SZtVg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1mvhbc-002LNW-Dq; Fri, 10 Dec 2021 15:14:40 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1mvhbI-002LGU-Cb for linux-arm-kernel@lists.infradead.org; Fri, 10 Dec 2021 15:14:24 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id BFE9C13A1; Fri, 10 Dec 2021 07:14:19 -0800 (PST) Received: from lakrids.cambridge.arm.com (usa-sjc-imap-foss1.foss.arm.com [10.121.207.14]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id DB0A33F5A1; Fri, 10 Dec 2021 07:14:18 -0800 (PST) From: Mark Rutland To: linux-arm-kernel@lists.infradead.org Cc: boqun.feng@gmail.com, catalin.marinas@arm.com, mark.rutland@arm.com, peterz@infradead.org, will@kernel.org Subject: [PATCH 2/5] arm64: atomics lse: define SUBs in terms of ADDs Date: Fri, 10 Dec 2021 15:14:07 +0000 Message-Id: <20211210151410.2782645-3-mark.rutland@arm.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20211210151410.2782645-1-mark.rutland@arm.com> References: <20211210151410.2782645-1-mark.rutland@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20211210_071420_598650_0C98DCDE X-CRM114-Status: GOOD ( 10.53 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org The FEAT_LSE atomic instructions include atomic ADD instructions (`stadd*` and `ldadd*`), but do not include atomic SUB instructions, so we must build all of the SUB operations using the ADD instructions. We open-code these today, with each SUB op implemented as a copy of the corresponding ADD op with a leading `neg` instruction in the inline assembly to negate the `i` argument. As the compiler has no visibility of the `neg`, this leads to less than optimal code generation when generating `i` into a register. For example, __les_atomic_fetch_sub(1, v) can be compiled to: mov w1, #0x1 neg w1, w1 ldaddal w1, w1, [x2] This patch improves this by replacing the `neg` with negation in C before the inline assembly block, e.g. i = -i; This allows the compiler to generate `i` into a register more optimally, e.g. mov w1, #0xffffffff ldaddal w1, w1, [x2] With this change the assembly for each SUB op is identical to the corresponding ADD op (including barriers and clobbers), so I've removed the inline assembly and rewritten each SUB op in terms of the corresponding ADD op, e.g. | static inline void __lse_atomic_sub(int i, atomic_t *v) | { | __lse_atomic_add(-i, v); | } For clarity I've moved the definition of each SUB op immediately after the corresponding ADD op, and used a single macro to create the RETURN forms of both ops. This is intended as an optimization and cleanup. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland Cc: Boqun Feng Cc: Catalin Marinas Cc: Peter Zijlstra Cc: Will Deacon Acked-by: Will Deacon --- arch/arm64/include/asm/atomic_lse.h | 180 +++++++++------------------- 1 file changed, 58 insertions(+), 122 deletions(-) diff --git a/arch/arm64/include/asm/atomic_lse.h b/arch/arm64/include/asm/atomic_lse.h index ab661375835e..7454febb6d77 100644 --- a/arch/arm64/include/asm/atomic_lse.h +++ b/arch/arm64/include/asm/atomic_lse.h @@ -25,6 +25,11 @@ ATOMIC_OP(or, stset) ATOMIC_OP(xor, steor) ATOMIC_OP(add, stadd) +static inline void __lse_atomic_sub(int i, atomic_t *v) +{ + __lse_atomic_add(-i, v); +} + #undef ATOMIC_OP #define ATOMIC_FETCH_OP(name, mb, op, asm_op, cl...) \ @@ -54,7 +59,20 @@ ATOMIC_FETCH_OPS(add, ldadd) #undef ATOMIC_FETCH_OP #undef ATOMIC_FETCH_OPS -#define ATOMIC_OP_ADD_RETURN(name, mb, cl...) \ +#define ATOMIC_FETCH_OP_SUB(name) \ +static inline int __lse_atomic_fetch_sub##name(int i, atomic_t *v) \ +{ \ + return __lse_atomic_fetch_add##name(-i, v); \ +} + +ATOMIC_FETCH_OP_SUB(_relaxed) +ATOMIC_FETCH_OP_SUB(_acquire) +ATOMIC_FETCH_OP_SUB(_release) +ATOMIC_FETCH_OP_SUB( ) + +#undef ATOMIC_FETCH_OP_SUB + +#define ATOMIC_OP_ADD_SUB_RETURN(name, mb, cl...) \ static inline int __lse_atomic_add_return##name(int i, atomic_t *v) \ { \ u32 tmp; \ @@ -68,14 +86,19 @@ static inline int __lse_atomic_add_return##name(int i, atomic_t *v) \ : cl); \ \ return i; \ +} \ + \ +static inline int __lse_atomic_sub_return##name(int i, atomic_t *v) \ +{ \ + return __lse_atomic_add_return##name(-i, v); \ } -ATOMIC_OP_ADD_RETURN(_relaxed, ) -ATOMIC_OP_ADD_RETURN(_acquire, a, "memory") -ATOMIC_OP_ADD_RETURN(_release, l, "memory") -ATOMIC_OP_ADD_RETURN( , al, "memory") +ATOMIC_OP_ADD_SUB_RETURN(_relaxed, ) +ATOMIC_OP_ADD_SUB_RETURN(_acquire, a, "memory") +ATOMIC_OP_ADD_SUB_RETURN(_release, l, "memory") +ATOMIC_OP_ADD_SUB_RETURN( , al, "memory") -#undef ATOMIC_OP_ADD_RETURN +#undef ATOMIC_OP_ADD_SUB_RETURN static inline void __lse_atomic_and(int i, atomic_t *v) { @@ -108,61 +131,6 @@ ATOMIC_FETCH_OP_AND( , al, "memory") #undef ATOMIC_FETCH_OP_AND -static inline void __lse_atomic_sub(int i, atomic_t *v) -{ - asm volatile( - __LSE_PREAMBLE - " neg %w[i], %w[i]\n" - " stadd %w[i], %[v]" - : [i] "+&r" (i), [v] "+Q" (v->counter) - : "r" (v)); -} - -#define ATOMIC_OP_SUB_RETURN(name, mb, cl...) \ -static inline int __lse_atomic_sub_return##name(int i, atomic_t *v) \ -{ \ - u32 tmp; \ - \ - asm volatile( \ - __LSE_PREAMBLE \ - " neg %w[i], %w[i]\n" \ - " ldadd" #mb " %w[i], %w[tmp], %[v]\n" \ - " add %w[i], %w[i], %w[tmp]" \ - : [i] "+&r" (i), [v] "+Q" (v->counter), [tmp] "=&r" (tmp) \ - : "r" (v) \ - : cl); \ - \ - return i; \ -} - -ATOMIC_OP_SUB_RETURN(_relaxed, ) -ATOMIC_OP_SUB_RETURN(_acquire, a, "memory") -ATOMIC_OP_SUB_RETURN(_release, l, "memory") -ATOMIC_OP_SUB_RETURN( , al, "memory") - -#undef ATOMIC_OP_SUB_RETURN - -#define ATOMIC_FETCH_OP_SUB(name, mb, cl...) \ -static inline int __lse_atomic_fetch_sub##name(int i, atomic_t *v) \ -{ \ - asm volatile( \ - __LSE_PREAMBLE \ - " neg %w[i], %w[i]\n" \ - " ldadd" #mb " %w[i], %w[i], %[v]" \ - : [i] "+&r" (i), [v] "+Q" (v->counter) \ - : "r" (v) \ - : cl); \ - \ - return i; \ -} - -ATOMIC_FETCH_OP_SUB(_relaxed, ) -ATOMIC_FETCH_OP_SUB(_acquire, a, "memory") -ATOMIC_FETCH_OP_SUB(_release, l, "memory") -ATOMIC_FETCH_OP_SUB( , al, "memory") - -#undef ATOMIC_FETCH_OP_SUB - #define ATOMIC64_OP(op, asm_op) \ static inline void __lse_atomic64_##op(s64 i, atomic64_t *v) \ { \ @@ -178,6 +146,11 @@ ATOMIC64_OP(or, stset) ATOMIC64_OP(xor, steor) ATOMIC64_OP(add, stadd) +static inline void __lse_atomic64_sub(s64 i, atomic64_t *v) +{ + __lse_atomic64_add(-i, v); +} + #undef ATOMIC64_OP #define ATOMIC64_FETCH_OP(name, mb, op, asm_op, cl...) \ @@ -207,7 +180,20 @@ ATOMIC64_FETCH_OPS(add, ldadd) #undef ATOMIC64_FETCH_OP #undef ATOMIC64_FETCH_OPS -#define ATOMIC64_OP_ADD_RETURN(name, mb, cl...) \ +#define ATOMIC64_FETCH_OP_SUB(name) \ +static inline long __lse_atomic64_fetch_sub##name(s64 i, atomic64_t *v) \ +{ \ + return __lse_atomic64_fetch_add##name(-i, v); \ +} + +ATOMIC64_FETCH_OP_SUB(_relaxed) +ATOMIC64_FETCH_OP_SUB(_acquire) +ATOMIC64_FETCH_OP_SUB(_release) +ATOMIC64_FETCH_OP_SUB( ) + +#undef ATOMIC64_FETCH_OP_SUB + +#define ATOMIC64_OP_ADD_SUB_RETURN(name, mb, cl...) \ static inline long __lse_atomic64_add_return##name(s64 i, atomic64_t *v)\ { \ unsigned long tmp; \ @@ -221,14 +207,19 @@ static inline long __lse_atomic64_add_return##name(s64 i, atomic64_t *v)\ : cl); \ \ return i; \ +} \ + \ +static inline long __lse_atomic64_sub_return##name(s64 i, atomic64_t *v)\ +{ \ + return __lse_atomic64_add_return##name(-i, v); \ } -ATOMIC64_OP_ADD_RETURN(_relaxed, ) -ATOMIC64_OP_ADD_RETURN(_acquire, a, "memory") -ATOMIC64_OP_ADD_RETURN(_release, l, "memory") -ATOMIC64_OP_ADD_RETURN( , al, "memory") +ATOMIC64_OP_ADD_SUB_RETURN(_relaxed, ) +ATOMIC64_OP_ADD_SUB_RETURN(_acquire, a, "memory") +ATOMIC64_OP_ADD_SUB_RETURN(_release, l, "memory") +ATOMIC64_OP_ADD_SUB_RETURN( , al, "memory") -#undef ATOMIC64_OP_ADD_RETURN +#undef ATOMIC64_OP_ADD_SUB_RETURN static inline void __lse_atomic64_and(s64 i, atomic64_t *v) { @@ -261,61 +252,6 @@ ATOMIC64_FETCH_OP_AND( , al, "memory") #undef ATOMIC64_FETCH_OP_AND -static inline void __lse_atomic64_sub(s64 i, atomic64_t *v) -{ - asm volatile( - __LSE_PREAMBLE - " neg %[i], %[i]\n" - " stadd %[i], %[v]" - : [i] "+&r" (i), [v] "+Q" (v->counter) - : "r" (v)); -} - -#define ATOMIC64_OP_SUB_RETURN(name, mb, cl...) \ -static inline long __lse_atomic64_sub_return##name(s64 i, atomic64_t *v)\ -{ \ - unsigned long tmp; \ - \ - asm volatile( \ - __LSE_PREAMBLE \ - " neg %[i], %[i]\n" \ - " ldadd" #mb " %[i], %x[tmp], %[v]\n" \ - " add %[i], %[i], %x[tmp]" \ - : [i] "+&r" (i), [v] "+Q" (v->counter), [tmp] "=&r" (tmp) \ - : "r" (v) \ - : cl); \ - \ - return i; \ -} - -ATOMIC64_OP_SUB_RETURN(_relaxed, ) -ATOMIC64_OP_SUB_RETURN(_acquire, a, "memory") -ATOMIC64_OP_SUB_RETURN(_release, l, "memory") -ATOMIC64_OP_SUB_RETURN( , al, "memory") - -#undef ATOMIC64_OP_SUB_RETURN - -#define ATOMIC64_FETCH_OP_SUB(name, mb, cl...) \ -static inline long __lse_atomic64_fetch_sub##name(s64 i, atomic64_t *v) \ -{ \ - asm volatile( \ - __LSE_PREAMBLE \ - " neg %[i], %[i]\n" \ - " ldadd" #mb " %[i], %[i], %[v]" \ - : [i] "+&r" (i), [v] "+Q" (v->counter) \ - : "r" (v) \ - : cl); \ - \ - return i; \ -} - -ATOMIC64_FETCH_OP_SUB(_relaxed, ) -ATOMIC64_FETCH_OP_SUB(_acquire, a, "memory") -ATOMIC64_FETCH_OP_SUB(_release, l, "memory") -ATOMIC64_FETCH_OP_SUB( , al, "memory") - -#undef ATOMIC64_FETCH_OP_SUB - static inline s64 __lse_atomic64_dec_if_positive(atomic64_t *v) { unsigned long tmp; From patchwork Fri Dec 10 15:14:08 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Rutland X-Patchwork-Id: 12695612 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A9D38C433F5 for ; Fri, 10 Dec 2021 15:16:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=OMVzFbLv4oL+PKzpWnn1ltCXypez3InaBh2c+DDb+/Y=; b=XeH3dzvlQoZafO avYY0Bl6480SolvTlDQj7+KjcYX0zLBN77yt6n687AaWeQHOPQmb5mhTBjSRUU9AI0xTflNtk2kMp IW1/G+1ExP0QdaQPjZ4eehOFSj9wiNGVnm+bpU6emNDek5VVb6VS1hW13AlQV/3t4PKZNnByN8QzW n4ES2nhgqROlqMXzZEih1hfzowmYTIhNB2G+XMrpNGLxA6FxCrm+vIcQLxK5tr9ispGHT0VlHL+PY zABjrISpJNkoi7vTLUu/Ep29Q5971SZ81ao9PFztO3j48/ga481N6i3StNGDjm+/jpXrEVblm5Kop m/904X6AgAmHMAOVHTHA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1mvhbz-002LUR-PF; Fri, 10 Dec 2021 15:15:03 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1mvhbJ-002LHU-TC for linux-arm-kernel@lists.infradead.org; Fri, 10 Dec 2021 15:14:26 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 4155A13D5; Fri, 10 Dec 2021 07:14:21 -0800 (PST) Received: from lakrids.cambridge.arm.com (usa-sjc-imap-foss1.foss.arm.com [10.121.207.14]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 5C4C63F5A1; Fri, 10 Dec 2021 07:14:20 -0800 (PST) From: Mark Rutland To: linux-arm-kernel@lists.infradead.org Cc: boqun.feng@gmail.com, catalin.marinas@arm.com, mark.rutland@arm.com, peterz@infradead.org, will@kernel.org Subject: [PATCH 3/5] arm64: atomics: lse: define ANDs in terms of ANDNOTs Date: Fri, 10 Dec 2021 15:14:08 +0000 Message-Id: <20211210151410.2782645-4-mark.rutland@arm.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20211210151410.2782645-1-mark.rutland@arm.com> References: <20211210151410.2782645-1-mark.rutland@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20211210_071422_110084_7A04A6BC X-CRM114-Status: UNSURE ( 9.35 ) X-CRM114-Notice: Please train this message. X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org The FEAT_LSE atomic instructions include atomic bit-clear instructions (`ldclr*` and `stclr*`) which can be used to directly implement ANDNOT operations. Each AND op is implemented as a copy of the corresponding ANDNOT op with a leading `mvn` instruction to apply a bitwise NOT to the `i` argument. As the compiler has no visibility of the `mvn`, this leads to less than optimal code generation when generating `i` into a register. For example, __lse_atomic_fetch_and(0xf, v) can be compiled to: mov w1, #0xf mvn w1, w1 ldclral w1, w1, [x2] This patch improves this by replacing the `mvn` with NOT in C before the inline assembly block, e.g. i = ~i; This allows the compiler to generate `i` into a register more optimally, e.g. mov w1, #0xfffffff0 ldclral w1, w1, [x2] With this change the assembly for each AND op is identical to the corresponding ANDNOT op (including barriers and clobbers), so I've removed the inline assembly and rewritten each AND op in terms of the corresponding ANDNOT op, e.g. | static inline void __lse_atomic_and(int i, atomic_t *v) | { | return __lse_atomic_andnot(~i, v); | } This is intended as an optimization and cleanup. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland Cc: Boqun Feng Cc: Catalin Marinas Cc: Peter Zijlstra Cc: Will Deacon Acked-by: Will Deacon --- arch/arm64/include/asm/atomic_lse.h | 34 ++++------------------------- 1 file changed, 4 insertions(+), 30 deletions(-) diff --git a/arch/arm64/include/asm/atomic_lse.h b/arch/arm64/include/asm/atomic_lse.h index 7454febb6d77..d707eafb7677 100644 --- a/arch/arm64/include/asm/atomic_lse.h +++ b/arch/arm64/include/asm/atomic_lse.h @@ -102,26 +102,13 @@ ATOMIC_OP_ADD_SUB_RETURN( , al, "memory") static inline void __lse_atomic_and(int i, atomic_t *v) { - asm volatile( - __LSE_PREAMBLE - " mvn %w[i], %w[i]\n" - " stclr %w[i], %[v]" - : [i] "+&r" (i), [v] "+Q" (v->counter) - : "r" (v)); + return __lse_atomic_andnot(~i, v); } #define ATOMIC_FETCH_OP_AND(name, mb, cl...) \ static inline int __lse_atomic_fetch_and##name(int i, atomic_t *v) \ { \ - asm volatile( \ - __LSE_PREAMBLE \ - " mvn %w[i], %w[i]\n" \ - " ldclr" #mb " %w[i], %w[i], %[v]" \ - : [i] "+&r" (i), [v] "+Q" (v->counter) \ - : "r" (v) \ - : cl); \ - \ - return i; \ + return __lse_atomic_fetch_andnot##name(~i, v); \ } ATOMIC_FETCH_OP_AND(_relaxed, ) @@ -223,26 +210,13 @@ ATOMIC64_OP_ADD_SUB_RETURN( , al, "memory") static inline void __lse_atomic64_and(s64 i, atomic64_t *v) { - asm volatile( - __LSE_PREAMBLE - " mvn %[i], %[i]\n" - " stclr %[i], %[v]" - : [i] "+&r" (i), [v] "+Q" (v->counter) - : "r" (v)); + return __lse_atomic64_andnot(~i, v); } #define ATOMIC64_FETCH_OP_AND(name, mb, cl...) \ static inline long __lse_atomic64_fetch_and##name(s64 i, atomic64_t *v) \ { \ - asm volatile( \ - __LSE_PREAMBLE \ - " mvn %[i], %[i]\n" \ - " ldclr" #mb " %[i], %[i], %[v]" \ - : [i] "+&r" (i), [v] "+Q" (v->counter) \ - : "r" (v) \ - : cl); \ - \ - return i; \ + return __lse_atomic64_fetch_andnot##name(~i, v); \ } ATOMIC64_FETCH_OP_AND(_relaxed, ) From patchwork Fri Dec 10 15:14:09 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Rutland X-Patchwork-Id: 12695613 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 818D8C433F5 for ; Fri, 10 Dec 2021 15:17:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=m4JijaPBsqv6BA2gyboZMd52IljozVOvKvG9BAho+8w=; b=I1oKwPen3bL3ae Bvw4JjxvYRoPllS8n86wjDVp/bzho3//zhIIFcS74FvpEou70JXyb0Ca6vuS+27E6RJAudJLtm9u0 TlqtHcpJYi/pOFTuiYXsjOXyYOxS9jjpOa+B3V5+ig6+qTFqDyugLfUaotmgt1MyDYbmUMO+1og9q WaP5OHQtoZiYoCjfzzJO8wvu3Z1UmYst/aInGvgC6aDLLHdag1Go4hSrdjTqv+5IXh03SnkqhppCw D+06YjQv0+kiA3thvc4y10gftad2UIx8VY6Nsw4v0K3usF4rYvbtc8MVvjefflh0A8o8DuODaUmjj FJgGK0zi/j+Aa6HyfeGA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1mvhcD-002Laj-Ia; Fri, 10 Dec 2021 15:15:17 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1mvhbL-002LIC-Ah for linux-arm-kernel@lists.infradead.org; Fri, 10 Dec 2021 15:14:27 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id C2435142F; Fri, 10 Dec 2021 07:14:22 -0800 (PST) Received: from lakrids.cambridge.arm.com (usa-sjc-imap-foss1.foss.arm.com [10.121.207.14]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id DD7043F5A1; Fri, 10 Dec 2021 07:14:21 -0800 (PST) From: Mark Rutland To: linux-arm-kernel@lists.infradead.org Cc: boqun.feng@gmail.com, catalin.marinas@arm.com, mark.rutland@arm.com, peterz@infradead.org, will@kernel.org Subject: [PATCH 4/5] arm64: atomics: lse: improve constraints for simple ops Date: Fri, 10 Dec 2021 15:14:09 +0000 Message-Id: <20211210151410.2782645-5-mark.rutland@arm.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20211210151410.2782645-1-mark.rutland@arm.com> References: <20211210151410.2782645-1-mark.rutland@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20211210_071423_503699_879119BA X-CRM114-Status: GOOD ( 12.94 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org We have overly conservative assembly constraints for the basic FEAT_LSE atomic instructions, and using more accurate and permissive constraints will allow for better code generation. The FEAT_LSE basic atomic instructions have come in two forms: LD{op}{order}{size} , , [] ST{op}{order}{size} , [] The ST* forms are aliases of the LD* forms where: ST{op}{order}{size} , [] Is: LD{op}{order}{size} , XZR, [] For either form, both and are read but not written back to, and is written with the original value of the memory location. Where ( == ) or ( == ), is written *after* the other register value(s) are consumed. There are no UNPREDICTABLE or CONSTRAINED UNPREDICTABLE behaviours when any pair of , , or are the same register. Our current inline assembly always uses == , treating this register as both an input and an output (using a '+r' constraint). This forces the compiler to do some unnecessary register shuffling and/or redundant value generation. For example, the compiler cannot reuse the value, and currently GCC 11.1.0 will compile: __lse_atomic_add(1, a); __lse_atomic_add(1, b); __lse_atomic_add(1, c); As: mov w3, #0x1 mov w4, w3 stadd w4, [x0] mov w0, w3 stadd w0, [x1] stadd w3, [x2] We can improve this with more accurate constraints, separating and , where is an input-only register ('r'), and is an output-only value ('=r'). As is written back after is consumed, it does not need to be earlyclobber ('=&r'), leaving the compiler free to use the same register for both and where this is desirable. At the same time, the redundant 'r' constraint for `v` is removed, as the `+Q` constraint is sufficient. With this change, the above example becomes: mov w3, #0x1 stadd w3, [x0] stadd w3, [x1] stadd w3, [x2] I've made this change for the non-value-returning and FETCH ops. The RETURN ops have a multi-instruction sequence for which we cannot use the same constraints, and a subsequent patch will rewrite hte RETURN ops in terms of the FETCH ops, relying on the ability for the compiler to reuse the value. This is intended as an optimization. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland Cc: Boqun Feng Cc: Catalin Marinas Cc: Peter Zijlstra Cc: Will Deacon Acked-by: Will Deacon --- arch/arm64/include/asm/atomic_lse.h | 30 +++++++++++++++++------------ 1 file changed, 18 insertions(+), 12 deletions(-) diff --git a/arch/arm64/include/asm/atomic_lse.h b/arch/arm64/include/asm/atomic_lse.h index d707eafb7677..e4c5c4c34ce6 100644 --- a/arch/arm64/include/asm/atomic_lse.h +++ b/arch/arm64/include/asm/atomic_lse.h @@ -16,8 +16,8 @@ static inline void __lse_atomic_##op(int i, atomic_t *v) \ asm volatile( \ __LSE_PREAMBLE \ " " #asm_op " %w[i], %[v]\n" \ - : [i] "+r" (i), [v] "+Q" (v->counter) \ - : "r" (v)); \ + : [v] "+Q" (v->counter) \ + : [i] "r" (i)); \ } ATOMIC_OP(andnot, stclr) @@ -35,14 +35,17 @@ static inline void __lse_atomic_sub(int i, atomic_t *v) #define ATOMIC_FETCH_OP(name, mb, op, asm_op, cl...) \ static inline int __lse_atomic_fetch_##op##name(int i, atomic_t *v) \ { \ + int old; \ + \ asm volatile( \ __LSE_PREAMBLE \ - " " #asm_op #mb " %w[i], %w[i], %[v]" \ - : [i] "+r" (i), [v] "+Q" (v->counter) \ - : "r" (v) \ + " " #asm_op #mb " %w[i], %w[old], %[v]" \ + : [v] "+Q" (v->counter), \ + [old] "=r" (old) \ + : [i] "r" (i) \ : cl); \ \ - return i; \ + return old; \ } #define ATOMIC_FETCH_OPS(op, asm_op) \ @@ -124,8 +127,8 @@ static inline void __lse_atomic64_##op(s64 i, atomic64_t *v) \ asm volatile( \ __LSE_PREAMBLE \ " " #asm_op " %[i], %[v]\n" \ - : [i] "+r" (i), [v] "+Q" (v->counter) \ - : "r" (v)); \ + : [v] "+Q" (v->counter) \ + : [i] "r" (i)); \ } ATOMIC64_OP(andnot, stclr) @@ -143,14 +146,17 @@ static inline void __lse_atomic64_sub(s64 i, atomic64_t *v) #define ATOMIC64_FETCH_OP(name, mb, op, asm_op, cl...) \ static inline long __lse_atomic64_fetch_##op##name(s64 i, atomic64_t *v)\ { \ + s64 old; \ + \ asm volatile( \ __LSE_PREAMBLE \ - " " #asm_op #mb " %[i], %[i], %[v]" \ - : [i] "+r" (i), [v] "+Q" (v->counter) \ - : "r" (v) \ + " " #asm_op #mb " %[i], %[old], %[v]" \ + : [v] "+Q" (v->counter), \ + [old] "=r" (old) \ + : [i] "r" (i) \ : cl); \ \ - return i; \ + return old; \ } #define ATOMIC64_FETCH_OPS(op, asm_op) \ From patchwork Fri Dec 10 15:14:10 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Rutland X-Patchwork-Id: 12695614 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 43D38C433EF for ; Fri, 10 Dec 2021 15:17:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=MyUlbJgj5blefnc1yZH+v+LkgtB84y9NsraoXS07zpI=; b=JwPelqKvc6x2ns 2CTdq840xiGtMLnzCbaf1oiWw34n511iBPqQKRUUPo8ju0mA93SIanisOe0UkzhwySK+5ODGNEGtC HDqpORr+X/urS3tFgC93JFQCRzjPgWL2GtDW8am3ftPfKpNKXdIKMarq9RWZLII4URPK+1M5NtNzP xB9+VuBKv9/s5QqG/ZSjw4MNSFGLpb2YIWnzRwpyb/JJF3dQMFwVodfwRFMijGA782iwwBB4d0ivO kFsgpn6r6WLY51i//I1z4Nm0/WRFTl9caV/Yu08y0YgVkzX22Tu0wMCL/wVz/b99OUXqY4+U3MJjt kIGrRt7UkvglGSxKnkpQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1mvhca-002Lly-IW; Fri, 10 Dec 2021 15:15:40 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1mvhbM-002LJB-TG for linux-arm-kernel@lists.infradead.org; Fri, 10 Dec 2021 15:14:28 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 58078106F; Fri, 10 Dec 2021 07:14:24 -0800 (PST) Received: from lakrids.cambridge.arm.com (usa-sjc-imap-foss1.foss.arm.com [10.121.207.14]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 72C483F5A1; Fri, 10 Dec 2021 07:14:23 -0800 (PST) From: Mark Rutland To: linux-arm-kernel@lists.infradead.org Cc: boqun.feng@gmail.com, catalin.marinas@arm.com, mark.rutland@arm.com, peterz@infradead.org, will@kernel.org Subject: [PATCH 5/5] arm64: atomics: lse: define RETURN ops in terms of FETCH ops Date: Fri, 10 Dec 2021 15:14:10 +0000 Message-Id: <20211210151410.2782645-6-mark.rutland@arm.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20211210151410.2782645-1-mark.rutland@arm.com> References: <20211210151410.2782645-1-mark.rutland@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20211210_071425_116365_B7BF9584 X-CRM114-Status: GOOD ( 13.04 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org The FEAT_LSE atomic instructions include LD* instructions which return the original value of a memory location can be used to directly implement FETCH opertations. Each RETURN op is implemented as a copy of the corresponding FETCH op with a trailing instruction instruction to generate the new value of the memory location. We only directly implement *_fetch_add*(), for which we have a trailing `add` instruction. As the compiler has no visibility of the `add`, this leads to less than optimal code generation when consuming the result. For example, the compiler cannot constant-fold the addition into later operations, and currently GCC 11.1.0 will compile: return __lse_atomic_sub_return(1, v) == 0; As: mov w1, #0xffffffff ldaddal w1, w2, [x0] add w1, w1, w2 cmp w1, #0x0 cset w0, eq // eq = none ret This patch improves this by replacing the `add` with C addition after the inline assembly block, e.g. ret += i; This allows the compiler to manipulate `i`. This permits the compiler to merge the `add` and `cmp` for the above, e.g. mov w1, #0xffffffff ldaddal w1, w1, [x0] cmp w1, #0x1 cset w0, eq // eq = none ret With this change the assembly for each RETURN op is identical to the corresponding FETCH op (including barriers and clobbers) so I've removed the inline assembly and rewritten each RETURN op in terms of the corresponding FETCH op, e.g. | static inline void __lse_atomic_add_return(int i, atomic_t *v) | { | return __lse_atomic_fetch_add(i, v) + i | } The new construction does not adversely affect the common case, and before and after this patch GCC 11.1.0 can compile: __lse_atomic_add_return(i, v) As: ldaddal w0, w2, [x1] add w0, w0, w2 ... while having the freedom to do better elsewhere. This is intended as an optimization and cleanup. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland Cc: Boqun Feng Cc: Catalin Marinas Cc: Peter Zijlstra Cc: Will Deacon Acked-by: Will Deacon --- arch/arm64/include/asm/atomic_lse.h | 48 +++++++++-------------------- 1 file changed, 14 insertions(+), 34 deletions(-) diff --git a/arch/arm64/include/asm/atomic_lse.h b/arch/arm64/include/asm/atomic_lse.h index e4c5c4c34ce6..d955ade5df7c 100644 --- a/arch/arm64/include/asm/atomic_lse.h +++ b/arch/arm64/include/asm/atomic_lse.h @@ -75,31 +75,21 @@ ATOMIC_FETCH_OP_SUB( ) #undef ATOMIC_FETCH_OP_SUB -#define ATOMIC_OP_ADD_SUB_RETURN(name, mb, cl...) \ +#define ATOMIC_OP_ADD_SUB_RETURN(name) \ static inline int __lse_atomic_add_return##name(int i, atomic_t *v) \ { \ - u32 tmp; \ - \ - asm volatile( \ - __LSE_PREAMBLE \ - " ldadd" #mb " %w[i], %w[tmp], %[v]\n" \ - " add %w[i], %w[i], %w[tmp]" \ - : [i] "+r" (i), [v] "+Q" (v->counter), [tmp] "=&r" (tmp) \ - : "r" (v) \ - : cl); \ - \ - return i; \ + return __lse_atomic_fetch_add##name(i, v) + i; \ } \ \ static inline int __lse_atomic_sub_return##name(int i, atomic_t *v) \ { \ - return __lse_atomic_add_return##name(-i, v); \ + return __lse_atomic_fetch_sub(i, v) - i; \ } -ATOMIC_OP_ADD_SUB_RETURN(_relaxed, ) -ATOMIC_OP_ADD_SUB_RETURN(_acquire, a, "memory") -ATOMIC_OP_ADD_SUB_RETURN(_release, l, "memory") -ATOMIC_OP_ADD_SUB_RETURN( , al, "memory") +ATOMIC_OP_ADD_SUB_RETURN(_relaxed) +ATOMIC_OP_ADD_SUB_RETURN(_acquire) +ATOMIC_OP_ADD_SUB_RETURN(_release) +ATOMIC_OP_ADD_SUB_RETURN( ) #undef ATOMIC_OP_ADD_SUB_RETURN @@ -186,31 +176,21 @@ ATOMIC64_FETCH_OP_SUB( ) #undef ATOMIC64_FETCH_OP_SUB -#define ATOMIC64_OP_ADD_SUB_RETURN(name, mb, cl...) \ +#define ATOMIC64_OP_ADD_SUB_RETURN(name) \ static inline long __lse_atomic64_add_return##name(s64 i, atomic64_t *v)\ { \ - unsigned long tmp; \ - \ - asm volatile( \ - __LSE_PREAMBLE \ - " ldadd" #mb " %[i], %x[tmp], %[v]\n" \ - " add %[i], %[i], %x[tmp]" \ - : [i] "+r" (i), [v] "+Q" (v->counter), [tmp] "=&r" (tmp) \ - : "r" (v) \ - : cl); \ - \ - return i; \ + return __lse_atomic64_fetch_add##name(i, v) + i; \ } \ \ static inline long __lse_atomic64_sub_return##name(s64 i, atomic64_t *v)\ { \ - return __lse_atomic64_add_return##name(-i, v); \ + return __lse_atomic64_fetch_sub##name(i, v) - i; \ } -ATOMIC64_OP_ADD_SUB_RETURN(_relaxed, ) -ATOMIC64_OP_ADD_SUB_RETURN(_acquire, a, "memory") -ATOMIC64_OP_ADD_SUB_RETURN(_release, l, "memory") -ATOMIC64_OP_ADD_SUB_RETURN( , al, "memory") +ATOMIC64_OP_ADD_SUB_RETURN(_relaxed) +ATOMIC64_OP_ADD_SUB_RETURN(_acquire) +ATOMIC64_OP_ADD_SUB_RETURN(_release) +ATOMIC64_OP_ADD_SUB_RETURN( ) #undef ATOMIC64_OP_ADD_SUB_RETURN