From patchwork Fri Dec 10 15:14:08 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Rutland X-Patchwork-Id: 12695612 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A9D38C433F5 for ; Fri, 10 Dec 2021 15:16:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=OMVzFbLv4oL+PKzpWnn1ltCXypez3InaBh2c+DDb+/Y=; b=XeH3dzvlQoZafO avYY0Bl6480SolvTlDQj7+KjcYX0zLBN77yt6n687AaWeQHOPQmb5mhTBjSRUU9AI0xTflNtk2kMp IW1/G+1ExP0QdaQPjZ4eehOFSj9wiNGVnm+bpU6emNDek5VVb6VS1hW13AlQV/3t4PKZNnByN8QzW n4ES2nhgqROlqMXzZEih1hfzowmYTIhNB2G+XMrpNGLxA6FxCrm+vIcQLxK5tr9ispGHT0VlHL+PY zABjrISpJNkoi7vTLUu/Ep29Q5971SZ81ao9PFztO3j48/ga481N6i3StNGDjm+/jpXrEVblm5Kop m/904X6AgAmHMAOVHTHA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1mvhbz-002LUR-PF; Fri, 10 Dec 2021 15:15:03 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1mvhbJ-002LHU-TC for linux-arm-kernel@lists.infradead.org; Fri, 10 Dec 2021 15:14:26 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 4155A13D5; Fri, 10 Dec 2021 07:14:21 -0800 (PST) Received: from lakrids.cambridge.arm.com (usa-sjc-imap-foss1.foss.arm.com [10.121.207.14]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 5C4C63F5A1; Fri, 10 Dec 2021 07:14:20 -0800 (PST) From: Mark Rutland To: linux-arm-kernel@lists.infradead.org Cc: boqun.feng@gmail.com, catalin.marinas@arm.com, mark.rutland@arm.com, peterz@infradead.org, will@kernel.org Subject: [PATCH 3/5] arm64: atomics: lse: define ANDs in terms of ANDNOTs Date: Fri, 10 Dec 2021 15:14:08 +0000 Message-Id: <20211210151410.2782645-4-mark.rutland@arm.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20211210151410.2782645-1-mark.rutland@arm.com> References: <20211210151410.2782645-1-mark.rutland@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20211210_071422_110084_7A04A6BC X-CRM114-Status: UNSURE ( 9.35 ) X-CRM114-Notice: Please train this message. X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org The FEAT_LSE atomic instructions include atomic bit-clear instructions (`ldclr*` and `stclr*`) which can be used to directly implement ANDNOT operations. Each AND op is implemented as a copy of the corresponding ANDNOT op with a leading `mvn` instruction to apply a bitwise NOT to the `i` argument. As the compiler has no visibility of the `mvn`, this leads to less than optimal code generation when generating `i` into a register. For example, __lse_atomic_fetch_and(0xf, v) can be compiled to: mov w1, #0xf mvn w1, w1 ldclral w1, w1, [x2] This patch improves this by replacing the `mvn` with NOT in C before the inline assembly block, e.g. i = ~i; This allows the compiler to generate `i` into a register more optimally, e.g. mov w1, #0xfffffff0 ldclral w1, w1, [x2] With this change the assembly for each AND op is identical to the corresponding ANDNOT op (including barriers and clobbers), so I've removed the inline assembly and rewritten each AND op in terms of the corresponding ANDNOT op, e.g. | static inline void __lse_atomic_and(int i, atomic_t *v) | { | return __lse_atomic_andnot(~i, v); | } This is intended as an optimization and cleanup. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland Cc: Boqun Feng Cc: Catalin Marinas Cc: Peter Zijlstra Cc: Will Deacon Acked-by: Will Deacon --- arch/arm64/include/asm/atomic_lse.h | 34 ++++------------------------- 1 file changed, 4 insertions(+), 30 deletions(-) diff --git a/arch/arm64/include/asm/atomic_lse.h b/arch/arm64/include/asm/atomic_lse.h index 7454febb6d77..d707eafb7677 100644 --- a/arch/arm64/include/asm/atomic_lse.h +++ b/arch/arm64/include/asm/atomic_lse.h @@ -102,26 +102,13 @@ ATOMIC_OP_ADD_SUB_RETURN( , al, "memory") static inline void __lse_atomic_and(int i, atomic_t *v) { - asm volatile( - __LSE_PREAMBLE - " mvn %w[i], %w[i]\n" - " stclr %w[i], %[v]" - : [i] "+&r" (i), [v] "+Q" (v->counter) - : "r" (v)); + return __lse_atomic_andnot(~i, v); } #define ATOMIC_FETCH_OP_AND(name, mb, cl...) \ static inline int __lse_atomic_fetch_and##name(int i, atomic_t *v) \ { \ - asm volatile( \ - __LSE_PREAMBLE \ - " mvn %w[i], %w[i]\n" \ - " ldclr" #mb " %w[i], %w[i], %[v]" \ - : [i] "+&r" (i), [v] "+Q" (v->counter) \ - : "r" (v) \ - : cl); \ - \ - return i; \ + return __lse_atomic_fetch_andnot##name(~i, v); \ } ATOMIC_FETCH_OP_AND(_relaxed, ) @@ -223,26 +210,13 @@ ATOMIC64_OP_ADD_SUB_RETURN( , al, "memory") static inline void __lse_atomic64_and(s64 i, atomic64_t *v) { - asm volatile( - __LSE_PREAMBLE - " mvn %[i], %[i]\n" - " stclr %[i], %[v]" - : [i] "+&r" (i), [v] "+Q" (v->counter) - : "r" (v)); + return __lse_atomic64_andnot(~i, v); } #define ATOMIC64_FETCH_OP_AND(name, mb, cl...) \ static inline long __lse_atomic64_fetch_and##name(s64 i, atomic64_t *v) \ { \ - asm volatile( \ - __LSE_PREAMBLE \ - " mvn %[i], %[i]\n" \ - " ldclr" #mb " %[i], %[i], %[v]" \ - : [i] "+&r" (i), [v] "+Q" (v->counter) \ - : "r" (v) \ - : cl); \ - \ - return i; \ + return __lse_atomic64_fetch_andnot##name(~i, v); \ } ATOMIC64_FETCH_OP_AND(_relaxed, )