From patchwork Wed May 31 13:08:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 13262297 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 93145C7EE23 for ; Wed, 31 May 2023 13:28:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236199AbjEaN2U (ORCPT ); Wed, 31 May 2023 09:28:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47404 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231836AbjEaN2T (ORCPT ); Wed, 31 May 2023 09:28:19 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C0145C0; Wed, 31 May 2023 06:28:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=YmtjcG8l219y7/fwmL50Z5SyZj2N/Dfw7Oi4KDcDI7Q=; b=EJ3ZB+xFh9CWEpg5yrG6OprVJc 4KY9jm91TcbrWOishlejJY5DbVF4mvlHugJmUDSsE/LkTcn4aim13LF+VunBmsPU6byCiTEoAXFtd PMY9zcKVyFfala+gu6tRVAh8pqQ01ufwI6rMYp7FCuX9C5mYnmZzPRiAM2anoIyZGo4A+qMj6+whs fEmcz0eNH94T1AESxYBFQUp4+DmILfWmyLz1Mx+MZvVIgDEzwKNwADbrfOqw3qVGm86SmNMGYmkIt gygTW9IHI214IRvW/eG2FY8W+9D4Iq6PvRg/jFdhrZleggC2VWlbR3HPvZmIIrMcseVeHlANgsyCs 01q0OpaA==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by casper.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat Linux)) id 1q4LrG-007IIk-36; Wed, 31 May 2023 13:27:22 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id F357830068D; Wed, 31 May 2023 15:27:16 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 7D554243B69D8; Wed, 31 May 2023 15:27:16 +0200 (CEST) Message-ID: <20230531132323.314826687@infradead.org> User-Agent: quilt/0.66 Date: Wed, 31 May 2023 15:08:34 +0200 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , Herbert Xu , davem@davemloft.net, penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org, linux-crypto@vger.kernel.org, sfr@canb.auug.org.au, mpe@ellerman.id.au, James.Bottomley@hansenpartnership.com, deller@gmx.de, linux-parisc@vger.kernel.org Subject: [PATCH 01/12] cyrpto/b128ops: Remove struct u128 References: <20230531130833.635651916@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-parisc@vger.kernel.org Per git-grep u128_xor() and its related struct u128 are unused except to implement {be,le}128_xor(). Remove them to free up the namespace. Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Arnd Bergmann Acked-by: Herbert Xu --- include/crypto/b128ops.h | 14 +++----------- 1 file changed, 3 insertions(+), 11 deletions(-) --- a/include/crypto/b128ops.h +++ b/include/crypto/b128ops.h @@ -50,10 +50,6 @@ #include typedef struct { - u64 a, b; -} u128; - -typedef struct { __be64 a, b; } be128; @@ -61,20 +57,16 @@ typedef struct { __le64 b, a; } le128; -static inline void u128_xor(u128 *r, const u128 *p, const u128 *q) +static inline void be128_xor(be128 *r, const be128 *p, const be128 *q) { r->a = p->a ^ q->a; r->b = p->b ^ q->b; } -static inline void be128_xor(be128 *r, const be128 *p, const be128 *q) -{ - u128_xor((u128 *)r, (u128 *)p, (u128 *)q); -} - static inline void le128_xor(le128 *r, const le128 *p, const le128 *q) { - u128_xor((u128 *)r, (u128 *)p, (u128 *)q); + r->a = p->a ^ q->a; + r->b = p->b ^ q->b; } #endif /* _CRYPTO_B128OPS_H */ From patchwork Wed May 31 13:08:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 13262301 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8D55FC87FDD for ; Wed, 31 May 2023 13:28:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236402AbjEaN2Y (ORCPT ); Wed, 31 May 2023 09:28:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47436 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235358AbjEaN2U (ORCPT ); Wed, 31 May 2023 09:28:20 -0400 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6B0F311F; Wed, 31 May 2023 06:28:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=De02UbLJVMu0IFxxAICiUX86I0+CYVgCoxMLmGl76l8=; b=qnjHbKiKPQHLXKyilOQKmohG2V GxBgsxD2zYySET8WbUC0G5aywathp8yHPni/OtulopKLf/OZwoJEOOexmOsH6Pm65LL3S6bqXg0nO j/0lmcJkKkIKaFw2sYNaRLJQ40H2FTEhwChB7YpLFfNOEim7WzUuwsGoEgtbevG7ocQ2aGO8BTCQl IW37BcTr7ZTB9XNV7eSjS06jPELWeyfqiNcQFr2zuwk/1zOc/QmbJPZsQmF7feuu9e0XkNvmjCoQP qDHHfVNgaD/t7GgN65ScBp7gQcmOAIJJynfNnxXzOcaC/oYSwaKixOLK+Qt00mpt8KpyTgbiYcEC2 PppYDfPw==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1q4LrF-00FTCW-2I; Wed, 31 May 2023 13:27:21 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 018523006C4; Wed, 31 May 2023 15:27:16 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 80187243B69E7; Wed, 31 May 2023 15:27:16 +0200 (CEST) Message-ID: <20230531132323.385005581@infradead.org> User-Agent: quilt/0.66 Date: Wed, 31 May 2023 15:08:35 +0200 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , Herbert Xu , davem@davemloft.net, penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org, linux-crypto@vger.kernel.org, sfr@canb.auug.org.au, mpe@ellerman.id.au, James.Bottomley@hansenpartnership.com, deller@gmx.de, linux-parisc@vger.kernel.org Subject: [PATCH 02/12] types: Introduce [us]128 References: <20230531130833.635651916@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-parisc@vger.kernel.org Introduce [us]128 (when available). Unlike [us]64, ensure they are always naturally aligned. This also enables 128bit wide atomics (which require natural alignment) such as cmpxchg128(). Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Arnd Bergmann Acked-by: Herbert Xu --- include/linux/types.h | 5 +++++ include/uapi/linux/types.h | 4 ++++ lib/crypto/curve25519-hacl64.c | 2 -- lib/crypto/poly1305-donna64.c | 2 -- 4 files changed, 9 insertions(+), 4 deletions(-) --- a/include/linux/types.h +++ b/include/linux/types.h @@ -10,6 +10,11 @@ #define DECLARE_BITMAP(name,bits) \ unsigned long name[BITS_TO_LONGS(bits)] +#ifdef __SIZEOF_INT128__ +typedef __s128 s128; +typedef __u128 u128; +#endif + typedef u32 __kernel_dev_t; typedef __kernel_fd_set fd_set; --- a/include/uapi/linux/types.h +++ b/include/uapi/linux/types.h @@ -13,6 +13,10 @@ #include +#ifdef __SIZEOF_INT128__ +typedef __signed__ __int128 __s128 __attribute__((aligned(16))); +typedef unsigned __int128 __u128 __attribute__((aligned(16))); +#endif /* * Below are truly Linux-specific types that should never collide with --- a/lib/crypto/curve25519-hacl64.c +++ b/lib/crypto/curve25519-hacl64.c @@ -14,8 +14,6 @@ #include #include -typedef __uint128_t u128; - static __always_inline u64 u64_eq_mask(u64 a, u64 b) { u64 x = a ^ b; --- a/lib/crypto/poly1305-donna64.c +++ b/lib/crypto/poly1305-donna64.c @@ -10,8 +10,6 @@ #include #include -typedef __uint128_t u128; - void poly1305_core_setkey(struct poly1305_core_key *key, const u8 raw_key[POLY1305_BLOCK_SIZE]) { From patchwork Wed May 31 13:08:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 13262303 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 27BB0C87FE2 for ; Wed, 31 May 2023 13:28:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236432AbjEaN20 (ORCPT ); Wed, 31 May 2023 09:28:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47456 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236330AbjEaN2V (ORCPT ); Wed, 31 May 2023 09:28:21 -0400 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 760A612E; Wed, 31 May 2023 06:28:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=iOoLCL15Lkqusyj683d7QtsgXbn6vbNrDVP1qYk8vVg=; b=Q+4qUo/0gXEAXjPbpQyNIIqvEh kjEGNk0ljWZ9MChbj8TcT5wWJlI7fyolz8aXLp7afGfbtmKvuh5texHuol2FFPc/BFLMIGaaXVNYz n0EkQ+NXc8ix3CcraX2ZQxht3IbiyDSMoaaq+Qqm5a/EG/DDCNhXI73AgUJK3I4daoxH+dJBjWHyz eM9YvNvDd8eszZB1eMcUPBXuLuZLs1Zkp/PEFE3wYw/UvZpvTaa1O6ftlpoSXG4GLAv3P+xSm2e0j Pp8sfVF7Z5X/h6J93s1K8E6FQQ14zcNfW0R1yKU5I/1qxkQ2WBcdqjbPSucwee3zzwS6V/rIDS6Im DJYU1zvA==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1q4LrF-00FTCX-2N; Wed, 31 May 2023 13:27:21 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id F0CD03003E1; Wed, 31 May 2023 15:27:16 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 83785243B69E8; Wed, 31 May 2023 15:27:16 +0200 (CEST) Message-ID: <20230531132323.452120708@infradead.org> User-Agent: quilt/0.66 Date: Wed, 31 May 2023 15:08:36 +0200 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , Herbert Xu , davem@davemloft.net, penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org, linux-crypto@vger.kernel.org, sfr@canb.auug.org.au, mpe@ellerman.id.au, James.Bottomley@hansenpartnership.com, deller@gmx.de, linux-parisc@vger.kernel.org Subject: [PATCH 03/12] arch: Introduce arch_{,try_}_cmpxchg128{,_local}() References: <20230531130833.635651916@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-parisc@vger.kernel.org For all architectures that currently support cmpxchg_double() implement the cmpxchg128() family of functions that is basically the same but with a saner interface. Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Arnd Bergmann Acked-by: Heiko Carstens Acked-by: Mark Rutland --- arch/arm64/include/asm/atomic_ll_sc.h | 41 +++++++++++++++++++++ arch/arm64/include/asm/atomic_lse.h | 31 ++++++++++++++++ arch/arm64/include/asm/cmpxchg.h | 26 +++++++++++++ arch/s390/include/asm/cmpxchg.h | 14 +++++++ arch/x86/include/asm/cmpxchg_32.h | 3 + arch/x86/include/asm/cmpxchg_64.h | 64 +++++++++++++++++++++++++++++++++- 6 files changed, 177 insertions(+), 2 deletions(-) --- a/arch/arm64/include/asm/atomic_ll_sc.h +++ b/arch/arm64/include/asm/atomic_ll_sc.h @@ -326,6 +326,47 @@ __CMPXCHG_DBL( , , , ) __CMPXCHG_DBL(_mb, dmb ish, l, "memory") #undef __CMPXCHG_DBL + +union __u128_halves { + u128 full; + struct { + u64 low, high; + }; +}; + +#define __CMPXCHG128(name, mb, rel, cl...) \ +static __always_inline u128 \ +__ll_sc__cmpxchg128##name(volatile u128 *ptr, u128 old, u128 new) \ +{ \ + union __u128_halves r, o = { .full = (old) }, \ + n = { .full = (new) }; \ + unsigned int tmp; \ + \ + asm volatile("// __cmpxchg128" #name "\n" \ + " prfm pstl1strm, %[v]\n" \ + "1: ldxp %[rl], %[rh], %[v]\n" \ + " cmp %[rl], %[ol]\n" \ + " ccmp %[rh], %[oh], 0, eq\n" \ + " b.ne 2f\n" \ + " st" #rel "xp %w[tmp], %[nl], %[nh], %[v]\n" \ + " cbnz %w[tmp], 1b\n" \ + " " #mb "\n" \ + "2:" \ + : [v] "+Q" (*(u128 *)ptr), \ + [rl] "=&r" (r.low), [rh] "=&r" (r.high), \ + [tmp] "=&r" (tmp) \ + : [ol] "r" (o.low), [oh] "r" (o.high), \ + [nl] "r" (n.low), [nh] "r" (n.high) \ + : "cc", ##cl); \ + \ + return r.full; \ +} + +__CMPXCHG128( , , ) +__CMPXCHG128(_mb, dmb ish, l, "memory") + +#undef __CMPXCHG128 + #undef K #endif /* __ASM_ATOMIC_LL_SC_H */ --- a/arch/arm64/include/asm/atomic_lse.h +++ b/arch/arm64/include/asm/atomic_lse.h @@ -317,4 +317,35 @@ __CMPXCHG_DBL(_mb, al, "memory") #undef __CMPXCHG_DBL +#define __CMPXCHG128(name, mb, cl...) \ +static __always_inline u128 \ +__lse__cmpxchg128##name(volatile u128 *ptr, u128 old, u128 new) \ +{ \ + union __u128_halves r, o = { .full = (old) }, \ + n = { .full = (new) }; \ + register unsigned long x0 asm ("x0") = o.low; \ + register unsigned long x1 asm ("x1") = o.high; \ + register unsigned long x2 asm ("x2") = n.low; \ + register unsigned long x3 asm ("x3") = n.high; \ + register unsigned long x4 asm ("x4") = (unsigned long)ptr; \ + \ + asm volatile( \ + __LSE_PREAMBLE \ + " casp" #mb "\t%[old1], %[old2], %[new1], %[new2], %[v]\n"\ + : [old1] "+&r" (x0), [old2] "+&r" (x1), \ + [v] "+Q" (*(u128 *)ptr) \ + : [new1] "r" (x2), [new2] "r" (x3), [ptr] "r" (x4), \ + [oldval1] "r" (o.low), [oldval2] "r" (o.high) \ + : cl); \ + \ + r.low = x0; r.high = x1; \ + \ + return r.full; \ +} + +__CMPXCHG128( , ) +__CMPXCHG128(_mb, al, "memory") + +#undef __CMPXCHG128 + #endif /* __ASM_ATOMIC_LSE_H */ --- a/arch/arm64/include/asm/cmpxchg.h +++ b/arch/arm64/include/asm/cmpxchg.h @@ -146,6 +146,19 @@ __CMPXCHG_DBL(_mb) #undef __CMPXCHG_DBL +#define __CMPXCHG128(name) \ +static inline u128 __cmpxchg128##name(volatile u128 *ptr, \ + u128 old, u128 new) \ +{ \ + return __lse_ll_sc_body(_cmpxchg128##name, \ + ptr, old, new); \ +} + +__CMPXCHG128( ) +__CMPXCHG128(_mb) + +#undef __CMPXCHG128 + #define __CMPXCHG_GEN(sfx) \ static __always_inline unsigned long __cmpxchg##sfx(volatile void *ptr, \ unsigned long old, \ @@ -228,6 +241,19 @@ __CMPXCHG_GEN(_mb) __ret; \ }) +/* cmpxchg128 */ +#define system_has_cmpxchg128() 1 + +#define arch_cmpxchg128(ptr, o, n) \ +({ \ + __cmpxchg128_mb((ptr), (o), (n)); \ +}) + +#define arch_cmpxchg128_local(ptr, o, n) \ +({ \ + __cmpxchg128((ptr), (o), (n)); \ +}) + #define __CMPWAIT_CASE(w, sfx, sz) \ static inline void __cmpwait_case_##sz(volatile void *ptr, \ unsigned long val) \ --- a/arch/s390/include/asm/cmpxchg.h +++ b/arch/s390/include/asm/cmpxchg.h @@ -224,4 +224,18 @@ static __always_inline int __cmpxchg_dou (unsigned long)(n1), (unsigned long)(n2)); \ }) +#define system_has_cmpxchg128() 1 + +static __always_inline u128 arch_cmpxchg128(volatile u128 *ptr, u128 old, u128 new) +{ + asm volatile( + " cdsg %[old],%[new],%[ptr]\n" + : [old] "+d" (old), [ptr] "+QS" (*ptr) + : [new] "d" (new) + : "memory", "cc"); + return old; +} + +#define arch_cmpxchg128 arch_cmpxchg128 + #endif /* __ASM_CMPXCHG_H */ --- a/arch/x86/include/asm/cmpxchg_32.h +++ b/arch/x86/include/asm/cmpxchg_32.h @@ -103,6 +103,7 @@ static inline bool __try_cmpxchg64(volat #endif -#define system_has_cmpxchg_double() boot_cpu_has(X86_FEATURE_CX8) +#define system_has_cmpxchg_double() boot_cpu_has(X86_FEATURE_CX8) +#define system_has_cmpxchg64() boot_cpu_has(X86_FEATURE_CX8) #endif /* _ASM_X86_CMPXCHG_32_H */ --- a/arch/x86/include/asm/cmpxchg_64.h +++ b/arch/x86/include/asm/cmpxchg_64.h @@ -20,6 +20,68 @@ arch_try_cmpxchg((ptr), (po), (n)); \ }) -#define system_has_cmpxchg_double() boot_cpu_has(X86_FEATURE_CX16) +union __u128_halves { + u128 full; + struct { + u64 low, high; + }; +}; + +#define __arch_cmpxchg128(_ptr, _old, _new, _lock) \ +({ \ + union __u128_halves o = { .full = (_old), }, \ + n = { .full = (_new), }; \ + \ + asm volatile(_lock "cmpxchg16b %[ptr]" \ + : [ptr] "+m" (*(_ptr)), \ + "+a" (o.low), "+d" (o.high) \ + : "b" (n.low), "c" (n.high) \ + : "memory"); \ + \ + o.full; \ +}) + +static __always_inline u128 arch_cmpxchg128(volatile u128 *ptr, u128 old, u128 new) +{ + return __arch_cmpxchg128(ptr, old, new, LOCK_PREFIX); +} + +static __always_inline u128 arch_cmpxchg128_local(volatile u128 *ptr, u128 old, u128 new) +{ + return __arch_cmpxchg128(ptr, old, new,); +} + +#define __arch_try_cmpxchg128(_ptr, _oldp, _new, _lock) \ +({ \ + union __u128_halves o = { .full = *(_oldp), }, \ + n = { .full = (_new), }; \ + bool ret; \ + \ + asm volatile(_lock "cmpxchg16b %[ptr]" \ + CC_SET(e) \ + : CC_OUT(e) (ret), \ + [ptr] "+m" (*ptr), \ + "+a" (o.low), "+d" (o.high) \ + : "b" (n.low), "c" (n.high) \ + : "memory"); \ + \ + if (unlikely(!ret)) \ + *(_oldp) = o.full; \ + \ + likely(ret); \ +}) + +static __always_inline bool arch_try_cmpxchg128(volatile u128 *ptr, u128 *oldp, u128 new) +{ + return __arch_try_cmpxchg128(ptr, oldp, new, LOCK_PREFIX); +} + +static __always_inline bool arch_try_cmpxchg128_local(volatile u128 *ptr, u128 *oldp, u128 new) +{ + return __arch_try_cmpxchg128(ptr, oldp, new,); +} + +#define system_has_cmpxchg_double() boot_cpu_has(X86_FEATURE_CX16) +#define system_has_cmpxchg128() boot_cpu_has(X86_FEATURE_CX16) #endif /* _ASM_X86_CMPXCHG_64_H */ From patchwork Wed May 31 13:08:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 13262305 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2AF46C7EE31 for ; Wed, 31 May 2023 13:28:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236455AbjEaN23 (ORCPT ); Wed, 31 May 2023 09:28:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47454 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236321AbjEaN2V (ORCPT ); Wed, 31 May 2023 09:28:21 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 025F3128; Wed, 31 May 2023 06:28:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=59NLGRWrtlYH4B9tBtjfHHRDJ8AYMaamGiEce66EFFM=; b=adU7qy7EtWqQ/9+jaTC6qBlJxL G8z124NAztw7qZZBS+WGAfy4TTNF5o52W6HPvaELFlgx3TI7FaXbEnhkgxoVixQWBL9AWJpJi2ghk fJWc42SnEir4no8Efxaw2AqvqkEl5jMLaxMY3OERvz+dYpkfOej/DVzAiDRrQ4x9u3isjhnuRIMk6 iMNoaxaXhsAnteNeakEMeSvIRWN1GOvn4ZXiEA1wfULaEG3y5PNlrVM3ALSrD9IY/VLPWv9pkYx6S Ke3+UYC7ua4b3+oPBoK4H6LhECBUNkv6gECK0i7Q/xuGBQM+J+xMvzKlghHoGenPWiSPuTw/f/Hih DMrFPG+A==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by casper.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat Linux)) id 1q4LrG-007IIi-3B; Wed, 31 May 2023 13:27:22 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id F0C4E3002F0; Wed, 31 May 2023 15:27:16 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 89173243B856D; Wed, 31 May 2023 15:27:16 +0200 (CEST) Message-ID: <20230531132323.519237070@infradead.org> User-Agent: quilt/0.66 Date: Wed, 31 May 2023 15:08:37 +0200 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , Herbert Xu , davem@davemloft.net, penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org, linux-crypto@vger.kernel.org, sfr@canb.auug.org.au, mpe@ellerman.id.au, James.Bottomley@hansenpartnership.com, deller@gmx.de, linux-parisc@vger.kernel.org Subject: [PATCH 04/12] instrumentation: Wire up cmpxchg128() References: <20230531130833.635651916@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-parisc@vger.kernel.org Wire up the cmpxchg128 family in the atomic wrapper scripts. These provide the generic cmpxchg128 family of functions from the arch_ prefixed version, adding explicit instrumentation where needed. Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Arnd Bergmann Acked-by: Mark Rutland --- include/linux/atomic/atomic-arch-fallback.h | 95 +++++++++++++++++++++++++++- include/linux/atomic/atomic-instrumented.h | 86 +++++++++++++++++++++++++ scripts/atomic/gen-atomic-fallback.sh | 4 - scripts/atomic/gen-atomic-instrumented.sh | 4 - 4 files changed, 183 insertions(+), 6 deletions(-) --- a/include/linux/atomic/atomic-arch-fallback.h +++ b/include/linux/atomic/atomic-arch-fallback.h @@ -77,6 +77,29 @@ #endif /* arch_cmpxchg64_relaxed */ +#ifndef arch_cmpxchg128_relaxed +#define arch_cmpxchg128_acquire arch_cmpxchg128 +#define arch_cmpxchg128_release arch_cmpxchg128 +#define arch_cmpxchg128_relaxed arch_cmpxchg128 +#else /* arch_cmpxchg128_relaxed */ + +#ifndef arch_cmpxchg128_acquire +#define arch_cmpxchg128_acquire(...) \ + __atomic_op_acquire(arch_cmpxchg128, __VA_ARGS__) +#endif + +#ifndef arch_cmpxchg128_release +#define arch_cmpxchg128_release(...) \ + __atomic_op_release(arch_cmpxchg128, __VA_ARGS__) +#endif + +#ifndef arch_cmpxchg128 +#define arch_cmpxchg128(...) \ + __atomic_op_fence(arch_cmpxchg128, __VA_ARGS__) +#endif + +#endif /* arch_cmpxchg128_relaxed */ + #ifndef arch_try_cmpxchg_relaxed #ifdef arch_try_cmpxchg #define arch_try_cmpxchg_acquire arch_try_cmpxchg @@ -217,6 +240,76 @@ #endif /* arch_try_cmpxchg64_relaxed */ +#ifndef arch_try_cmpxchg128_relaxed +#ifdef arch_try_cmpxchg128 +#define arch_try_cmpxchg128_acquire arch_try_cmpxchg128 +#define arch_try_cmpxchg128_release arch_try_cmpxchg128 +#define arch_try_cmpxchg128_relaxed arch_try_cmpxchg128 +#endif /* arch_try_cmpxchg128 */ + +#ifndef arch_try_cmpxchg128 +#define arch_try_cmpxchg128(_ptr, _oldp, _new) \ +({ \ + typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \ + ___r = arch_cmpxchg128((_ptr), ___o, (_new)); \ + if (unlikely(___r != ___o)) \ + *___op = ___r; \ + likely(___r == ___o); \ +}) +#endif /* arch_try_cmpxchg128 */ + +#ifndef arch_try_cmpxchg128_acquire +#define arch_try_cmpxchg128_acquire(_ptr, _oldp, _new) \ +({ \ + typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \ + ___r = arch_cmpxchg128_acquire((_ptr), ___o, (_new)); \ + if (unlikely(___r != ___o)) \ + *___op = ___r; \ + likely(___r == ___o); \ +}) +#endif /* arch_try_cmpxchg128_acquire */ + +#ifndef arch_try_cmpxchg128_release +#define arch_try_cmpxchg128_release(_ptr, _oldp, _new) \ +({ \ + typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \ + ___r = arch_cmpxchg128_release((_ptr), ___o, (_new)); \ + if (unlikely(___r != ___o)) \ + *___op = ___r; \ + likely(___r == ___o); \ +}) +#endif /* arch_try_cmpxchg128_release */ + +#ifndef arch_try_cmpxchg128_relaxed +#define arch_try_cmpxchg128_relaxed(_ptr, _oldp, _new) \ +({ \ + typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \ + ___r = arch_cmpxchg128_relaxed((_ptr), ___o, (_new)); \ + if (unlikely(___r != ___o)) \ + *___op = ___r; \ + likely(___r == ___o); \ +}) +#endif /* arch_try_cmpxchg128_relaxed */ + +#else /* arch_try_cmpxchg128_relaxed */ + +#ifndef arch_try_cmpxchg128_acquire +#define arch_try_cmpxchg128_acquire(...) \ + __atomic_op_acquire(arch_try_cmpxchg128, __VA_ARGS__) +#endif + +#ifndef arch_try_cmpxchg128_release +#define arch_try_cmpxchg128_release(...) \ + __atomic_op_release(arch_try_cmpxchg128, __VA_ARGS__) +#endif + +#ifndef arch_try_cmpxchg128 +#define arch_try_cmpxchg128(...) \ + __atomic_op_fence(arch_try_cmpxchg128, __VA_ARGS__) +#endif + +#endif /* arch_try_cmpxchg128_relaxed */ + #ifndef arch_try_cmpxchg_local #define arch_try_cmpxchg_local(_ptr, _oldp, _new) \ ({ \ @@ -2668,4 +2761,4 @@ arch_atomic64_dec_if_positive(atomic64_t #endif #endif /* _LINUX_ATOMIC_FALLBACK_H */ -// ad2e2b4d168dbc60a73922616047a9bfa446af36 +// 52dfc6fe4a2e7234bbd2aa3e16a377c1db793a53 --- a/include/linux/atomic/atomic-instrumented.h +++ b/include/linux/atomic/atomic-instrumented.h @@ -2034,6 +2034,36 @@ atomic_long_dec_if_positive(atomic_long_ arch_cmpxchg64_relaxed(__ai_ptr, __VA_ARGS__); \ }) +#define cmpxchg128(ptr, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + kcsan_mb(); \ + instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \ + arch_cmpxchg128(__ai_ptr, __VA_ARGS__); \ +}) + +#define cmpxchg128_acquire(ptr, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \ + arch_cmpxchg128_acquire(__ai_ptr, __VA_ARGS__); \ +}) + +#define cmpxchg128_release(ptr, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + kcsan_release(); \ + instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \ + arch_cmpxchg128_release(__ai_ptr, __VA_ARGS__); \ +}) + +#define cmpxchg128_relaxed(ptr, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \ + arch_cmpxchg128_relaxed(__ai_ptr, __VA_ARGS__); \ +}) + #define try_cmpxchg(ptr, oldp, ...) \ ({ \ typeof(ptr) __ai_ptr = (ptr); \ @@ -2110,6 +2140,44 @@ atomic_long_dec_if_positive(atomic_long_ arch_try_cmpxchg64_relaxed(__ai_ptr, __ai_oldp, __VA_ARGS__); \ }) +#define try_cmpxchg128(ptr, oldp, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + typeof(oldp) __ai_oldp = (oldp); \ + kcsan_mb(); \ + instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \ + instrument_read_write(__ai_oldp, sizeof(*__ai_oldp)); \ + arch_try_cmpxchg128(__ai_ptr, __ai_oldp, __VA_ARGS__); \ +}) + +#define try_cmpxchg128_acquire(ptr, oldp, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + typeof(oldp) __ai_oldp = (oldp); \ + instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \ + instrument_read_write(__ai_oldp, sizeof(*__ai_oldp)); \ + arch_try_cmpxchg128_acquire(__ai_ptr, __ai_oldp, __VA_ARGS__); \ +}) + +#define try_cmpxchg128_release(ptr, oldp, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + typeof(oldp) __ai_oldp = (oldp); \ + kcsan_release(); \ + instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \ + instrument_read_write(__ai_oldp, sizeof(*__ai_oldp)); \ + arch_try_cmpxchg128_release(__ai_ptr, __ai_oldp, __VA_ARGS__); \ +}) + +#define try_cmpxchg128_relaxed(ptr, oldp, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + typeof(oldp) __ai_oldp = (oldp); \ + instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \ + instrument_read_write(__ai_oldp, sizeof(*__ai_oldp)); \ + arch_try_cmpxchg128_relaxed(__ai_ptr, __ai_oldp, __VA_ARGS__); \ +}) + #define cmpxchg_local(ptr, ...) \ ({ \ typeof(ptr) __ai_ptr = (ptr); \ @@ -2124,6 +2192,13 @@ atomic_long_dec_if_positive(atomic_long_ arch_cmpxchg64_local(__ai_ptr, __VA_ARGS__); \ }) +#define cmpxchg128_local(ptr, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \ + arch_cmpxchg128_local(__ai_ptr, __VA_ARGS__); \ +}) + #define sync_cmpxchg(ptr, ...) \ ({ \ typeof(ptr) __ai_ptr = (ptr); \ @@ -2150,6 +2225,15 @@ atomic_long_dec_if_positive(atomic_long_ arch_try_cmpxchg64_local(__ai_ptr, __ai_oldp, __VA_ARGS__); \ }) +#define try_cmpxchg128_local(ptr, oldp, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + typeof(oldp) __ai_oldp = (oldp); \ + instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \ + instrument_read_write(__ai_oldp, sizeof(*__ai_oldp)); \ + arch_try_cmpxchg128_local(__ai_ptr, __ai_oldp, __VA_ARGS__); \ +}) + #define cmpxchg_double(ptr, ...) \ ({ \ typeof(ptr) __ai_ptr = (ptr); \ @@ -2167,4 +2251,4 @@ atomic_long_dec_if_positive(atomic_long_ }) #endif /* _LINUX_ATOMIC_INSTRUMENTED_H */ -// 6b513a42e1a1b5962532a019b7fc91eaa044ad5e +// 82d1be694fab30414527d0877c29fa75ed5a0b74 --- a/scripts/atomic/gen-atomic-fallback.sh +++ b/scripts/atomic/gen-atomic-fallback.sh @@ -217,11 +217,11 @@ cat << EOF EOF -for xchg in "arch_xchg" "arch_cmpxchg" "arch_cmpxchg64"; do +for xchg in "arch_xchg" "arch_cmpxchg" "arch_cmpxchg64" "arch_cmpxchg128"; do gen_xchg_fallbacks "${xchg}" done -for cmpxchg in "cmpxchg" "cmpxchg64"; do +for cmpxchg in "cmpxchg" "cmpxchg64" "cmpxchg128"; do gen_try_cmpxchg_fallbacks "${cmpxchg}" done --- a/scripts/atomic/gen-atomic-instrumented.sh +++ b/scripts/atomic/gen-atomic-instrumented.sh @@ -166,14 +166,14 @@ grep '^[a-z]' "$1" | while read name met done -for xchg in "xchg" "cmpxchg" "cmpxchg64" "try_cmpxchg" "try_cmpxchg64"; do +for xchg in "xchg" "cmpxchg" "cmpxchg64" "cmpxchg128" "try_cmpxchg" "try_cmpxchg64" "try_cmpxchg128"; do for order in "" "_acquire" "_release" "_relaxed"; do gen_xchg "${xchg}" "${order}" "" printf "\n" done done -for xchg in "cmpxchg_local" "cmpxchg64_local" "sync_cmpxchg" "try_cmpxchg_local" "try_cmpxchg64_local" ; do +for xchg in "cmpxchg_local" "cmpxchg64_local" "cmpxchg128_local" "sync_cmpxchg" "try_cmpxchg_local" "try_cmpxchg64_local" "try_cmpxchg128_local"; do gen_xchg "${xchg}" "" "" printf "\n" done From patchwork Wed May 31 13:08:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 13262300 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5861DC7EE23 for ; Wed, 31 May 2023 13:28:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236384AbjEaN2X (ORCPT ); Wed, 31 May 2023 09:28:23 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47438 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236197AbjEaN2U (ORCPT ); Wed, 31 May 2023 09:28:20 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 53082C5; Wed, 31 May 2023 06:28:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=0OKPBc3mt+ezfW70uBbauBPqh/QqcxZBd2GTGmmN3GA=; b=F56ze+VZ1f3mhyQBWTK1kZLCn6 DLlgkphQzhgklsVG742VjxiOdPtMB2tFaRRj4ss52ZIzg5HDYYF31EK4DXSi/DBDAeVFHjkmrUm0h ne6G2EAVd73HH115bh0qIhr/f9oQBI9cuDHH7XFY1y/+Q1FpTMH9h1VOfDocjZ5WggkAfzbIC/6lw /R1le7+B6S9xT4fJdC8c/xNvJw5kkcaR6ZulxEdI+j2JZO7YzfdcWmyiXFyEXXqo4vyOHwzhrYeMk 0nK10XnwacPBKHLyQBVTSJ62ZjRwUYdyes4r9vMpGHQ0Qmitpgvm5rzNtp4HxY1UbL2ndUxA94pCS aCNZwNaA==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by casper.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat Linux)) id 1q4LrG-007IJ5-DJ; Wed, 31 May 2023 13:27:22 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 55AF2300CB5; Wed, 31 May 2023 15:27:21 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 8BB1B243B8560; Wed, 31 May 2023 15:27:16 +0200 (CEST) Message-ID: <20230531132323.587480729@infradead.org> User-Agent: quilt/0.66 Date: Wed, 31 May 2023 15:08:38 +0200 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , Herbert Xu , davem@davemloft.net, penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org, linux-crypto@vger.kernel.org, sfr@canb.auug.org.au, mpe@ellerman.id.au, James.Bottomley@hansenpartnership.com, deller@gmx.de, linux-parisc@vger.kernel.org Subject: [PATCH 05/12] percpu: Add {raw,this}_cpu_try_cmpxchg() References: <20230531130833.635651916@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-parisc@vger.kernel.org Add the try_cmpxchg() form to the per-cpu ops. Signed-off-by: Peter Zijlstra (Intel) --- include/asm-generic/percpu.h | 113 +++++++++++++++++++++++++++++++++++++++++-- include/linux/percpu-defs.h | 19 +++++++ 2 files changed, 128 insertions(+), 4 deletions(-) --- a/include/asm-generic/percpu.h +++ b/include/asm-generic/percpu.h @@ -89,16 +89,37 @@ do { \ __ret; \ }) -#define raw_cpu_generic_cmpxchg(pcp, oval, nval) \ +#define __cpu_fallback_try_cmpxchg(pcp, ovalp, nval, _cmpxchg) \ +({ \ + typeof(pcp) __val, __old = *(ovalp); \ + __val = _cmpxchg(pcp, __old, nval); \ + if (__val != __old) \ + *(ovalp) = __val; \ + __val == __old; \ +}) + +#define raw_cpu_generic_try_cmpxchg(pcp, ovalp, nval) \ ({ \ typeof(pcp) *__p = raw_cpu_ptr(&(pcp)); \ - typeof(pcp) __ret; \ - __ret = *__p; \ - if (__ret == (oval)) \ + typeof(pcp) __val = *__p, __old = *(ovalp); \ + bool __ret; \ + if (__val == __old) { \ *__p = nval; \ + __ret = true; \ + } else { \ + *(ovalp) = __val; \ + __ret = false; \ + } \ __ret; \ }) +#define raw_cpu_generic_cmpxchg(pcp, oval, nval) \ +({ \ + typeof(pcp) __old = (oval); \ + raw_cpu_generic_try_cmpxchg(pcp, &__old, nval); \ + __old; \ +}) + #define raw_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) \ ({ \ typeof(pcp1) *__p1 = raw_cpu_ptr(&(pcp1)); \ @@ -170,6 +191,16 @@ do { \ __ret; \ }) +#define this_cpu_generic_try_cmpxchg(pcp, ovalp, nval) \ +({ \ + bool __ret; \ + unsigned long __flags; \ + raw_local_irq_save(__flags); \ + __ret = raw_cpu_generic_try_cmpxchg(pcp, ovalp, nval); \ + raw_local_irq_restore(__flags); \ + __ret; \ +}) + #define this_cpu_generic_cmpxchg(pcp, oval, nval) \ ({ \ typeof(pcp) __ret; \ @@ -282,6 +313,43 @@ do { \ #define raw_cpu_xchg_8(pcp, nval) raw_cpu_generic_xchg(pcp, nval) #endif +#ifndef raw_cpu_try_cmpxchg_1 +#ifdef raw_cpu_cmpxchg_1 +#define raw_cpu_try_cmpxchg_1(pcp, ovalp, nval) \ + __cpu_fallback_try_cmpxchg(pcp, ovalp, nval, raw_cpu_cmpxchg_1) +#else +#define raw_cpu_try_cmpxchg_1(pcp, ovalp, nval) \ + raw_cpu_generic_try_cmpxchg(pcp, ovalp, nval) +#endif +#endif +#ifndef raw_cpu_try_cmpxchg_2 +#ifdef raw_cpu_cmpxchg_2 +#define raw_cpu_try_cmpxchg_2(pcp, ovalp, nval) \ + __cpu_fallback_try_cmpxchg(pcp, ovalp, nval, raw_cpu_cmpxchg_2) +#else +#define raw_cpu_try_cmpxchg_2(pcp, ovalp, nval) \ + raw_cpu_generic_try_cmpxchg(pcp, ovalp, nval) +#endif +#endif +#ifndef raw_cpu_try_cmpxchg_4 +#ifdef raw_cpu_cmpxchg_4 +#define raw_cpu_try_cmpxchg_4(pcp, ovalp, nval) \ + __cpu_fallback_try_cmpxchg(pcp, ovalp, nval, raw_cpu_cmpxchg_4) +#else +#define raw_cpu_try_cmpxchg_4(pcp, ovalp, nval) \ + raw_cpu_generic_try_cmpxchg(pcp, ovalp, nval) +#endif +#endif +#ifndef raw_cpu_try_cmpxchg_8 +#ifdef raw_cpu_cmpxchg_8 +#define raw_cpu_try_cmpxchg_8(pcp, ovalp, nval) \ + __cpu_fallback_try_cmpxchg(pcp, ovalp, nval, raw_cpu_cmpxchg_8) +#else +#define raw_cpu_try_cmpxchg_8(pcp, ovalp, nval) \ + raw_cpu_generic_try_cmpxchg(pcp, ovalp, nval) +#endif +#endif + #ifndef raw_cpu_cmpxchg_1 #define raw_cpu_cmpxchg_1(pcp, oval, nval) \ raw_cpu_generic_cmpxchg(pcp, oval, nval) @@ -407,6 +475,43 @@ do { \ #define this_cpu_xchg_8(pcp, nval) this_cpu_generic_xchg(pcp, nval) #endif +#ifndef this_cpu_try_cmpxchg_1 +#ifdef this_cpu_cmpxchg_1 +#define this_cpu_try_cmpxchg_1(pcp, ovalp, nval) \ + __cpu_fallback_try_cmpxchg(pcp, ovalp, nval, this_cpu_cmpxchg_1) +#else +#define this_cpu_try_cmpxchg_1(pcp, ovalp, nval) \ + this_cpu_generic_try_cmpxchg(pcp, ovalp, nval) +#endif +#endif +#ifndef this_cpu_try_cmpxchg_2 +#ifdef this_cpu_cmpxchg_2 +#define this_cpu_try_cmpxchg_2(pcp, ovalp, nval) \ + __cpu_fallback_try_cmpxchg(pcp, ovalp, nval, this_cpu_cmpxchg_2) +#else +#define this_cpu_try_cmpxchg_2(pcp, ovalp, nval) \ + this_cpu_generic_try_cmpxchg(pcp, ovalp, nval) +#endif +#endif +#ifndef this_cpu_try_cmpxchg_4 +#ifdef this_cpu_cmpxchg_4 +#define this_cpu_try_cmpxchg_4(pcp, ovalp, nval) \ + __cpu_fallback_try_cmpxchg(pcp, ovalp, nval, this_cpu_cmpxchg_4) +#else +#define this_cpu_try_cmpxchg_4(pcp, ovalp, nval) \ + this_cpu_generic_try_cmpxchg(pcp, ovalp, nval) +#endif +#endif +#ifndef this_cpu_try_cmpxchg_8 +#ifdef this_cpu_cmpxchg_8 +#define this_cpu_try_cmpxchg_8(pcp, ovalp, nval) \ + __cpu_fallback_try_cmpxchg(pcp, ovalp, nval, this_cpu_cmpxchg_8) +#else +#define this_cpu_try_cmpxchg_8(pcp, ovalp, nval) \ + this_cpu_generic_try_cmpxchg(pcp, ovalp, nval) +#endif +#endif + #ifndef this_cpu_cmpxchg_1 #define this_cpu_cmpxchg_1(pcp, oval, nval) \ this_cpu_generic_cmpxchg(pcp, oval, nval) --- a/include/linux/percpu-defs.h +++ b/include/linux/percpu-defs.h @@ -343,6 +343,21 @@ static __always_inline void __this_cpu_p pscr2_ret__; \ }) +#define __pcpu_size_call_return2bool(stem, variable, ...) \ +({ \ + bool pscr2_ret__; \ + __verify_pcpu_ptr(&(variable)); \ + switch(sizeof(variable)) { \ + case 1: pscr2_ret__ = stem##1(variable, __VA_ARGS__); break; \ + case 2: pscr2_ret__ = stem##2(variable, __VA_ARGS__); break; \ + case 4: pscr2_ret__ = stem##4(variable, __VA_ARGS__); break; \ + case 8: pscr2_ret__ = stem##8(variable, __VA_ARGS__); break; \ + default: \ + __bad_size_call_parameter(); break; \ + } \ + pscr2_ret__; \ +}) + /* * Special handling for cmpxchg_double. cmpxchg_double is passed two * percpu variables. The first has to be aligned to a double word @@ -426,6 +441,8 @@ do { \ #define raw_cpu_xchg(pcp, nval) __pcpu_size_call_return2(raw_cpu_xchg_, pcp, nval) #define raw_cpu_cmpxchg(pcp, oval, nval) \ __pcpu_size_call_return2(raw_cpu_cmpxchg_, pcp, oval, nval) +#define raw_cpu_try_cmpxchg(pcp, ovalp, nval) \ + __pcpu_size_call_return2bool(raw_cpu_try_cmpxchg_, pcp, ovalp, nval) #define raw_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) \ __pcpu_double_call_return_bool(raw_cpu_cmpxchg_double_, pcp1, pcp2, oval1, oval2, nval1, nval2) @@ -513,6 +530,8 @@ do { \ #define this_cpu_xchg(pcp, nval) __pcpu_size_call_return2(this_cpu_xchg_, pcp, nval) #define this_cpu_cmpxchg(pcp, oval, nval) \ __pcpu_size_call_return2(this_cpu_cmpxchg_, pcp, oval, nval) +#define this_cpu_try_cmpxchg(pcp, ovalp, nval) \ + __pcpu_size_call_return2bool(this_cpu_try_cmpxchg_, pcp, ovalp, nval) #define this_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) \ __pcpu_double_call_return_bool(this_cpu_cmpxchg_double_, pcp1, pcp2, oval1, oval2, nval1, nval2) From patchwork Wed May 31 13:08:39 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 13262307 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EE6C7C7EE39 for ; Wed, 31 May 2023 13:28:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236485AbjEaN2e (ORCPT ); Wed, 31 May 2023 09:28:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47456 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236367AbjEaN2W (ORCPT ); Wed, 31 May 2023 09:28:22 -0400 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 00005C0; Wed, 31 May 2023 06:28:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=Mf2DqF9+qmlB3aVgKTie/uCFSNlIvgxgU+h4eBbbVoE=; b=CWkbaVBPzKtpsr2a1a2hRuIWEs Y5SENCjjcw/5P2q7aD0YPbYmEcry3JvUsXvX2MYf6pIsJUqb4xvEWiVL/tnJHZJuX451RigPl5KMW YwMVH6eB6iOiwq0g+j7J4cB9lRQWnAXxpoNeE6u3/W6XOD6c2u6v02ClQGQsuCxWDFQGDLpgmyN60 QM3Mj7YF17Jk3on8SLT4R1wBDE62+16g+RASLcbdicYRSr/9AvauqZuLkzWIUYmGs+2amCY3YESy1 ZqlJ0GMxo7MyoJpGk0tIbeBwkiywbJX9YkqZLxrFdoUfwEBe7sCbBhHioUurtNYVpKH1IyJqfkupR TWIkrlBg==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1q4LrG-00FTCb-1J; Wed, 31 May 2023 13:27:22 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 55A6630084B; Wed, 31 May 2023 15:27:21 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 901F5243B856E; Wed, 31 May 2023 15:27:16 +0200 (CEST) Message-ID: <20230531132323.654945124@infradead.org> User-Agent: quilt/0.66 Date: Wed, 31 May 2023 15:08:39 +0200 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , Herbert Xu , davem@davemloft.net, penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org, linux-crypto@vger.kernel.org, sfr@canb.auug.org.au, mpe@ellerman.id.au, James.Bottomley@hansenpartnership.com, deller@gmx.de, linux-parisc@vger.kernel.org Subject: [PATCH 06/12] percpu: Wire up cmpxchg128 References: <20230531130833.635651916@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-parisc@vger.kernel.org In order to replace cmpxchg_double() with the newly minted cmpxchg128() family of functions, wire it up in this_cpu_cmpxchg(). Signed-off-by: Peter Zijlstra (Intel) --- arch/arm64/include/asm/percpu.h | 20 ++++++++++ arch/s390/include/asm/percpu.h | 16 ++++++++ arch/x86/include/asm/percpu.h | 74 ++++++++++++++++++++++++++++++++++++---- arch/x86/lib/Makefile | 3 + arch/x86/lib/cmpxchg16b_emu.S | 43 +++++++++++++---------- arch/x86/lib/cmpxchg8b_emu.S | 67 ++++++++++++++++++++++++++++-------- include/asm-generic/percpu.h | 56 ++++++++++++++++++++++++++++++ 7 files changed, 240 insertions(+), 39 deletions(-) --- a/arch/arm64/include/asm/percpu.h +++ b/arch/arm64/include/asm/percpu.h @@ -140,6 +140,10 @@ PERCPU_RET_OP(add, add, ldadd) * re-enabling preemption for preemptible kernels, but doing that in a way * which builds inside a module would mean messing directly with the preempt * count. If you do this, peterz and tglx will hunt you down. + * + * Not to mention it'll break the actual preemption model for missing a + * preemption point when TIF_NEED_RESCHED gets set while preemption is + * disabled. */ #define this_cpu_cmpxchg_double_8(ptr1, ptr2, o1, o2, n1, n2) \ ({ \ @@ -240,6 +244,22 @@ PERCPU_RET_OP(add, add, ldadd) #define this_cpu_cmpxchg_8(pcp, o, n) \ _pcp_protect_return(cmpxchg_relaxed, pcp, o, n) +#define this_cpu_cmpxchg64(pcp, o, n) this_cpu_cmpxchg_8(pcp, o, n) + +#define this_cpu_cmpxchg128(pcp, o, n) \ +({ \ + typedef typeof(pcp) pcp_op_T__; \ + u128 old__, new__, ret__; \ + pcp_op_T__ *ptr__; \ + old__ = o; \ + new__ = n; \ + preempt_disable_notrace(); \ + ptr__ = raw_cpu_ptr(&(pcp)); \ + ret__ = cmpxchg128_local((void *)ptr__, old__, new__); \ + preempt_enable_notrace(); \ + ret__; \ +}) + #ifdef __KVM_NVHE_HYPERVISOR__ extern unsigned long __hyp_per_cpu_offset(unsigned int cpu); #define __per_cpu_offset --- a/arch/s390/include/asm/percpu.h +++ b/arch/s390/include/asm/percpu.h @@ -148,6 +148,22 @@ #define this_cpu_cmpxchg_4(pcp, oval, nval) arch_this_cpu_cmpxchg(pcp, oval, nval) #define this_cpu_cmpxchg_8(pcp, oval, nval) arch_this_cpu_cmpxchg(pcp, oval, nval) +#define this_cpu_cmpxchg64(pcp, o, n) this_cpu_cmpxchg_8(pcp, o, n) + +#define this_cpu_cmpxchg128(pcp, oval, nval) \ +({ \ + typedef typeof(pcp) pcp_op_T__; \ + u128 old__, new__, ret__; \ + pcp_op_T__ *ptr__; \ + old__ = oval; \ + new__ = nval; \ + preempt_disable_notrace(); \ + ptr__ = raw_cpu_ptr(&(pcp)); \ + ret__ = cmpxchg128((void *)ptr__, old__, new__); \ + preempt_enable_notrace(); \ + ret__; \ +}) + #define arch_this_cpu_xchg(pcp, nval) \ ({ \ typeof(pcp) *ptr__; \ --- a/arch/x86/include/asm/percpu.h +++ b/arch/x86/include/asm/percpu.h @@ -210,6 +210,67 @@ do { \ (typeof(_var))(unsigned long) pco_old__; \ }) +#if defined(CONFIG_X86_32) && !defined(CONFIG_UML) +#define percpu_cmpxchg64_op(size, qual, _var, _oval, _nval) \ +({ \ + union { \ + u64 var; \ + struct { \ + u32 low, high; \ + }; \ + } old__, new__; \ + \ + old__.var = _oval; \ + new__.var = _nval; \ + \ + asm qual (ALTERNATIVE("leal %P[var], %%esi; call this_cpu_cmpxchg8b_emu", \ + "cmpxchg8b " __percpu_arg([var]), X86_FEATURE_CX8) \ + : [var] "+m" (_var), \ + "+a" (old__.low), \ + "+d" (old__.high) \ + : "b" (new__.low), \ + "c" (new__.high) \ + : "memory", "esi"); \ + \ + old__.var; \ +}) + +#define raw_cpu_cmpxchg64(pcp, oval, nval) percpu_cmpxchg64_op(8, , pcp, oval, nval) +#define this_cpu_cmpxchg64(pcp, oval, nval) percpu_cmpxchg64_op(8, volatile, pcp, oval, nval) +#endif + +#ifdef CONFIG_X86_64 +#define raw_cpu_cmpxchg64(pcp, oval, nval) percpu_cmpxchg_op(8, , pcp, oval, nval); +#define this_cpu_cmpxchg64(pcp, oval, nval) percpu_cmpxchg_op(8, volatile, pcp, oval, nval); + +#define percpu_cmpxchg128_op(size, qual, _var, _oval, _nval) \ +({ \ + union { \ + u128 var; \ + struct { \ + u64 low, high; \ + }; \ + } old__, new__; \ + \ + old__.var = _oval; \ + new__.var = _nval; \ + \ + asm qual (ALTERNATIVE("leaq %P[var], %%rsi; call this_cpu_cmpxchg16b_emu", \ + "cmpxchg16b " __percpu_arg([var]), X86_FEATURE_CX16) \ + : [var] "+m" (_var), \ + "+a" (old__.low), \ + "+d" (old__.high) \ + : "b" (new__.low), \ + "c" (new__.high) \ + : "memory", "rsi"); \ + \ + old__.var; \ +}) + +#define raw_cpu_cmpxchg128(pcp, oval, nval) percpu_cmpxchg128_op(16, , pcp, oval, nval) +#define this_cpu_cmpxchg128(pcp, oval, nval) percpu_cmpxchg128_op(16, volatile, pcp, oval, nval) +#endif + /* * this_cpu_read() makes gcc load the percpu variable every time it is * accessed while this_cpu_read_stable() allows the value to be cached. @@ -341,12 +402,13 @@ do { \ bool __ret; \ typeof(pcp1) __o1 = (o1), __n1 = (n1); \ typeof(pcp2) __o2 = (o2), __n2 = (n2); \ - alternative_io("leaq %P1,%%rsi\n\tcall this_cpu_cmpxchg16b_emu\n\t", \ - "cmpxchg16b " __percpu_arg(1) "\n\tsetz %0\n\t", \ - X86_FEATURE_CX16, \ - ASM_OUTPUT2("=a" (__ret), "+m" (pcp1), \ - "+m" (pcp2), "+d" (__o2)), \ - "b" (__n1), "c" (__n2), "a" (__o1) : "rsi"); \ + asm volatile (ALTERNATIVE("leaq %P1, %%rsi; call this_cpu_cmpxchg16b_emu", \ + "cmpxchg16b " __percpu_arg(1), X86_FEATURE_CX16) \ + "setz %0" \ + : "=a" (__ret), "+m" (pcp1) \ + : "b" (__n1), "c" (__n2), \ + "a" (__o1), "d" (__o2) \ + : "memory", "rsi"); \ __ret; \ }) --- a/arch/x86/lib/Makefile +++ b/arch/x86/lib/Makefile @@ -61,8 +61,9 @@ ifeq ($(CONFIG_X86_32),y) lib-y += strstr_32.o lib-y += string_32.o lib-y += memmove_32.o + lib-y += cmpxchg8b_emu.o ifneq ($(CONFIG_X86_CMPXCHG64),y) - lib-y += cmpxchg8b_emu.o atomic64_386_32.o + lib-y += atomic64_386_32.o endif else obj-y += iomap_copy_64.o --- a/arch/x86/lib/cmpxchg16b_emu.S +++ b/arch/x86/lib/cmpxchg16b_emu.S @@ -1,47 +1,54 @@ /* SPDX-License-Identifier: GPL-2.0-only */ #include #include +#include .text /* + * Emulate 'cmpxchg16b %gs:(%rsi)' + * * Inputs: * %rsi : memory location to compare * %rax : low 64 bits of old value * %rdx : high 64 bits of old value * %rbx : low 64 bits of new value * %rcx : high 64 bits of new value - * %al : Operation successful + * + * Notably this is not LOCK prefixed and is not safe against NMIs */ SYM_FUNC_START(this_cpu_cmpxchg16b_emu) -# -# Emulate 'cmpxchg16b %gs:(%rsi)' except we return the result in %al not -# via the ZF. Caller will access %al to get result. -# -# Note that this is only useful for a cpuops operation. Meaning that we -# do *not* have a fully atomic operation but just an operation that is -# *atomic* on a single cpu (as provided by the this_cpu_xx class of -# macros). -# pushfq cli - cmpq PER_CPU_VAR((%rsi)), %rax - jne .Lnot_same - cmpq PER_CPU_VAR(8(%rsi)), %rdx - jne .Lnot_same + /* if (*ptr == old) */ + cmpq PER_CPU_VAR(0(%rsi)), %rax + jne .Lnot_same + cmpq PER_CPU_VAR(8(%rsi)), %rdx + jne .Lnot_same + + /* *ptr = new */ + movq %rbx, PER_CPU_VAR(0(%rsi)) + movq %rcx, PER_CPU_VAR(8(%rsi)) - movq %rbx, PER_CPU_VAR((%rsi)) - movq %rcx, PER_CPU_VAR(8(%rsi)) + /* set ZF in EFLAGS to indicate success */ + orl $X86_EFLAGS_ZF, (%rsp) popfq - mov $1, %al RET .Lnot_same: + /* *ptr != old */ + + /* old = *ptr */ + movq PER_CPU_VAR(0(%rsi)), %rax + movq PER_CPU_VAR(8(%rsi)), %rdx + + /* clear ZF in EFLAGS to indicate failure */ + andl $(~X86_EFLAGS_ZF), (%rsp) + popfq - xor %al,%al RET SYM_FUNC_END(this_cpu_cmpxchg16b_emu) --- a/arch/x86/lib/cmpxchg8b_emu.S +++ b/arch/x86/lib/cmpxchg8b_emu.S @@ -2,10 +2,16 @@ #include #include +#include +#include .text +#ifndef CONFIG_X86_CMPXCHG64 + /* + * Emulate 'cmpxchg8b (%esi)' on UP + * * Inputs: * %esi : memory location to compare * %eax : low 32 bits of old value @@ -15,32 +21,65 @@ */ SYM_FUNC_START(cmpxchg8b_emu) -# -# Emulate 'cmpxchg8b (%esi)' on UP except we don't -# set the whole ZF thing (caller will just compare -# eax:edx with the expected value) -# pushfl cli - cmpl (%esi), %eax - jne .Lnot_same - cmpl 4(%esi), %edx - jne .Lhalf_same + cmpl 0(%esi), %eax + jne .Lnot_same + cmpl 4(%esi), %edx + jne .Lnot_same + + movl %ebx, 0(%esi) + movl %ecx, 4(%esi) - movl %ebx, (%esi) - movl %ecx, 4(%esi) + orl $X86_EFLAGS_ZF, (%esp) popfl RET .Lnot_same: - movl (%esi), %eax -.Lhalf_same: - movl 4(%esi), %edx + movl 0(%esi), %eax + movl 4(%esi), %edx + + andl $(~X86_EFLAGS_ZF), (%esp) popfl RET SYM_FUNC_END(cmpxchg8b_emu) EXPORT_SYMBOL(cmpxchg8b_emu) + +#endif + +#ifndef CONFIG_UML + +SYM_FUNC_START(this_cpu_cmpxchg8b_emu) + + pushfl + cli + + cmpl PER_CPU_VAR(0(%esi)), %eax + jne .Lnot_same2 + cmpl PER_CPU_VAR(4(%esi)), %edx + jne .Lnot_same2 + + movl %ebx, PER_CPU_VAR(0(%esi)) + movl %ecx, PER_CPU_VAR(4(%esi)) + + orl $X86_EFLAGS_ZF, (%esp) + + popfl + RET + +.Lnot_same2: + movl PER_CPU_VAR(0(%esi)), %eax + movl PER_CPU_VAR(4(%esi)), %edx + + andl $(~X86_EFLAGS_ZF), (%esp) + + popfl + RET + +SYM_FUNC_END(this_cpu_cmpxchg8b_emu) + +#endif --- a/include/asm-generic/percpu.h +++ b/include/asm-generic/percpu.h @@ -350,6 +350,25 @@ do { \ #endif #endif +#ifndef raw_cpu_try_cmpxchg64 +#ifdef raw_cpu_cmpxchg64 +#define raw_cpu_try_cmpxchg64(pcp, ovalp, nval) \ + __cpu_fallback_try_cmpxchg(pcp, ovalp, nval, raw_cpu_cmpxchg64) +#else +#define raw_cpu_try_cmpxchg64(pcp, ovalp, nval) \ + raw_cpu_generic_try_cmpxchg(pcp, ovalp, nval) +#endif +#endif +#ifndef raw_cpu_try_cmpxchg128 +#ifdef raw_cpu_cmpxchg128 +#define raw_cpu_try_cmpxchg128(pcp, ovalp, nval) \ + __cpu_fallback_try_cmpxchg(pcp, ovalp, nval, raw_cpu_cmpxchg128) +#else +#define raw_cpu_try_cmpxchg128(pcp, ovalp, nval) \ + raw_cpu_generic_try_cmpxchg(pcp, ovalp, nval) +#endif +#endif + #ifndef raw_cpu_cmpxchg_1 #define raw_cpu_cmpxchg_1(pcp, oval, nval) \ raw_cpu_generic_cmpxchg(pcp, oval, nval) @@ -367,6 +386,15 @@ do { \ raw_cpu_generic_cmpxchg(pcp, oval, nval) #endif +#ifndef raw_cpu_cmpxchg64 +#define raw_cpu_cmpxchg64(pcp, oval, nval) \ + raw_cpu_generic_cmpxchg(pcp, oval, nval) +#endif +#ifndef raw_cpu_cmpxchg128 +#define raw_cpu_cmpxchg128(pcp, oval, nval) \ + raw_cpu_generic_cmpxchg(pcp, oval, nval) +#endif + #ifndef raw_cpu_cmpxchg_double_1 #define raw_cpu_cmpxchg_double_1(pcp1, pcp2, oval1, oval2, nval1, nval2) \ raw_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) @@ -512,6 +540,25 @@ do { \ #endif #endif +#ifndef this_cpu_try_cmpxchg64 +#ifdef this_cpu_cmpxchg64 +#define this_cpu_try_cmpxchg64(pcp, ovalp, nval) \ + __cpu_fallback_try_cmpxchg(pcp, ovalp, nval, this_cpu_cmpxchg64) +#else +#define this_cpu_try_cmpxchg64(pcp, ovalp, nval) \ + this_cpu_generic_try_cmpxchg(pcp, ovalp, nval) +#endif +#endif +#ifndef this_cpu_try_cmpxchg128 +#ifdef this_cpu_cmpxchg128 +#define this_cpu_try_cmpxchg128(pcp, ovalp, nval) \ + __cpu_fallback_try_cmpxchg(pcp, ovalp, nval, this_cpu_cmpxchg128) +#else +#define this_cpu_try_cmpxchg128(pcp, ovalp, nval) \ + this_cpu_generic_try_cmpxchg(pcp, ovalp, nval) +#endif +#endif + #ifndef this_cpu_cmpxchg_1 #define this_cpu_cmpxchg_1(pcp, oval, nval) \ this_cpu_generic_cmpxchg(pcp, oval, nval) @@ -529,6 +576,15 @@ do { \ this_cpu_generic_cmpxchg(pcp, oval, nval) #endif +#ifndef this_cpu_cmpxchg64 +#define this_cpu_cmpxchg64(pcp, oval, nval) \ + this_cpu_generic_cmpxchg(pcp, oval, nval) +#endif +#ifndef this_cpu_cmpxchg128 +#define this_cpu_cmpxchg128(pcp, oval, nval) \ + this_cpu_generic_cmpxchg(pcp, oval, nval) +#endif + #ifndef this_cpu_cmpxchg_double_1 #define this_cpu_cmpxchg_double_1(pcp1, pcp2, oval1, oval2, nval1, nval2) \ this_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) From patchwork Wed May 31 13:08:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 13262306 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D8A42C7EE33 for ; Wed, 31 May 2023 13:28:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236471AbjEaN2c (ORCPT ); Wed, 31 May 2023 09:28:32 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47450 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236347AbjEaN2V (ORCPT ); Wed, 31 May 2023 09:28:21 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 26E48123; Wed, 31 May 2023 06:28:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=OYq3F8aZwPPC/Y8UsexFdal3aiucmNsqQObmCMz9zjE=; b=LIi2st653mrq+qr82jZjSZoSMu b4IdGvoFjlZSfW2/JQDfIQgkhRmizDc+3i69fP7F75A4WAkEYw/JPtYzsJYXvDjOlO3lYqV3CTQG2 W5QHSm3lF3LRIXC6DORpHYE0GY/d/aV+oN7w/SrBOka2MUxnFUNUBPn6uIzvLTkgZCknrMs9/URJT jHAuSX05RfqOvsyIyF/kFoW7hbFpLP/c+WXZHzxrpS37v59bTdQPKeQCwIDOzbrtY2xMJrtSDtp8c jQSyhSMxk2M6L9+FSzaQWKUsmX61W6UzXzdaMQ0wKEbO8TRWRg4MO80O6dbE/9Vv3BMgIz9xTSMYB XK7W6Mcg==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by casper.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat Linux)) id 1q4LrH-007IJW-3w; Wed, 31 May 2023 13:27:23 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 55A95300C0E; Wed, 31 May 2023 15:27:21 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 95BD0243B856F; Wed, 31 May 2023 15:27:16 +0200 (CEST) Message-ID: <20230531132323.722039569@infradead.org> User-Agent: quilt/0.66 Date: Wed, 31 May 2023 15:08:40 +0200 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , Herbert Xu , davem@davemloft.net, penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org, linux-crypto@vger.kernel.org, sfr@canb.auug.org.au, mpe@ellerman.id.au, James.Bottomley@hansenpartnership.com, deller@gmx.de, linux-parisc@vger.kernel.org Subject: [PATCH 07/12] percpu: #ifndef __SIZEOF_INT128__ References: <20230531130833.635651916@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-parisc@vger.kernel.org Some 64bit architectures do not advertise __SIZEOF_INT128__ on all supported compiler versions. Notably the HPPA64 only started doing with GCC-11. Since the per-cpu ops are universally availably, and this_cpu_{,try_}cmpxchg128() is expected to be available on all 64bit architectures a wee bodge is in order. Sadly, while C reverts to memcpy() for assignment of POD types, it does not revert to memcmp() for for equality. Therefore frob that manually. Signed-off-by: Peter Zijlstra (Intel) --- include/asm-generic/percpu.h | 56 +++++++++++++++++++++++++++++++++++++++++++ include/linux/types.h | 7 +++++ 2 files changed, 63 insertions(+) --- a/include/asm-generic/percpu.h +++ b/include/asm-generic/percpu.h @@ -313,6 +313,35 @@ do { \ #define raw_cpu_xchg_8(pcp, nval) raw_cpu_generic_xchg(pcp, nval) #endif +#ifndef __SIZEOF_INT128__ +#define raw_cpu_generic_try_cmpxchg_memcmp(pcp, ovalp, nval) \ +({ \ + typeof(pcp) *__p = raw_cpu_ptr(&(pcp)); \ + typeof(pcp) __val = *__p, __old = *(ovalp); \ + bool __ret; \ + if (!__builtin_memcmp(&__val, &__old, sizeof(pcp))) { \ + *__p = nval; \ + __ret = true; \ + } else { \ + *(ovalp) = __val; \ + __ret = false; \ + } \ + __ret; \ +}) + +#define raw_cpu_generic_cmpxchg_memcmp(pcp, oval, nval) \ +({ \ + typeof(pcp) __old = (oval); \ + raw_cpu_generic_try_cmpxchg_memcpy(pcp, &__old, nval); \ + __old; \ +}) + +#define raw_cpu_cmpxchg128(pcp, oval, nval) \ + raw_cpu_generic_cmpxchg_memcmp(pcp, oval, nval) +#define raw_cpu_try_cmpxchg128(pcp, ovalp, nval) \ + raw_cpu_generic_try_cmpxchg_memcmp(pcp, ovalp, nval) +#endif + #ifndef raw_cpu_try_cmpxchg_1 #ifdef raw_cpu_cmpxchg_1 #define raw_cpu_try_cmpxchg_1(pcp, ovalp, nval) \ @@ -503,6 +532,33 @@ do { \ #define this_cpu_xchg_8(pcp, nval) this_cpu_generic_xchg(pcp, nval) #endif +#ifndef __SIZEOF_INT128__ +#define this_cpu_generic_try_cmpxchg_memcmp(pcp, ovalp, nval) \ +({ \ + bool __ret; \ + unsigned long __flags; \ + raw_local_irq_save(__flags); \ + __ret = raw_cpu_generic_try_cmpxchg_memcmp(pcp, ovalp, nval); \ + raw_local_irq_restore(__flags); \ + __ret; \ +}) + +#define this_cpu_generic_cmpxchg_memcmp(pcp, oval, nval) \ +({ \ + typeof(pcp) __ret; \ + unsigned long __flags; \ + raw_local_irq_save(__flags); \ + __ret = raw_cpu_generic_cmpxchg_memcmp(pcp, oval, nval); \ + raw_local_irq_restore(__flags); \ + __ret; \ +}) + +#define this_cpu_cmpxchg128(pcp, oval, nval) \ + this_cpu_generic_cmpxchg_memcmp(pcp, oval, nval) +#define this_cpu_try_cmpxchg128(pcp, ovalp, nval) \ + this_cpu_generic_try_cmpxchg_memcmp(pcp, ovalp, nval) +#endif + #ifndef this_cpu_try_cmpxchg_1 #ifdef this_cpu_cmpxchg_1 #define this_cpu_try_cmpxchg_1(pcp, ovalp, nval) \ --- a/include/linux/types.h +++ b/include/linux/types.h @@ -13,6 +13,13 @@ #ifdef __SIZEOF_INT128__ typedef __s128 s128; typedef __u128 u128; +#else +#ifdef CONFIG_64BIT +/* hack for this_cpu_cmpxchg128 */ +typedef struct { + u64 a, b; +} u128 __attribute__((aligned(16))); +#endif #endif typedef u32 __kernel_dev_t; From patchwork Wed May 31 13:08:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 13262298 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5E3A5C7EE37 for ; Wed, 31 May 2023 13:28:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236339AbjEaN2V (ORCPT ); Wed, 31 May 2023 09:28:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47432 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234861AbjEaN2U (ORCPT ); Wed, 31 May 2023 09:28:20 -0400 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 69925101; Wed, 31 May 2023 06:28:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=EFRjNdGkKmPs8t8EQjORBoOhY59fsy+w5h28GAZsB6Q=; b=UtVwkPrM/XC85pGt7r9J/NNpRu UL+91AuyLHvjrvG49pQLde3FfNNUmo1htXYmBdQi6zGavV3SSWRcdYYRg2pT/RFAN1s0beaKp6uIt qmsCYWl48JdxSnUcl0TZnP/DxO6Las7rvju+4B6BHlugr3GN/O1xwkrFbvEm1TcgBc0nFb02d74Q+ FugUvUwlkjaq40KU1RNfTvbPliEJmyGnO7N9Y8FQ1QTQ6NKNxRxIvQbmh4ObO5y3tSWG6QFsZealY U6f5lhw6Luo8Kph7RKkJJmtoJB0hFyStxCRmgm2iCzUziTKBMxS0MFEMxeMLjZEeDlsmuqhPZCMlC fkpnhqUg==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1q4LrG-00FTCc-1L; Wed, 31 May 2023 13:27:22 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 55AC1300C4B; Wed, 31 May 2023 15:27:21 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 992BE243B8570; Wed, 31 May 2023 15:27:16 +0200 (CEST) Message-ID: <20230531132323.788955257@infradead.org> User-Agent: quilt/0.66 Date: Wed, 31 May 2023 15:08:41 +0200 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , Herbert Xu , davem@davemloft.net, penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org, linux-crypto@vger.kernel.org, sfr@canb.auug.org.au, mpe@ellerman.id.au, James.Bottomley@hansenpartnership.com, deller@gmx.de, linux-parisc@vger.kernel.org, Vasant Hegde Subject: [PATCH 08/12] x86,amd_iommu: Replace cmpxchg_double() References: <20230531130833.635651916@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-parisc@vger.kernel.org Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Arnd Bergmann Tested-by: Vasant Hegde --- drivers/iommu/amd/amd_iommu_types.h | 9 +++++++-- drivers/iommu/amd/iommu.c | 10 ++++------ 2 files changed, 11 insertions(+), 8 deletions(-) --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -986,8 +986,13 @@ union irte_ga_hi { }; struct irte_ga { - union irte_ga_lo lo; - union irte_ga_hi hi; + union { + struct { + union irte_ga_lo lo; + union irte_ga_hi hi; + }; + u128 irte; + }; }; struct irq_2_irte { --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -3003,10 +3003,10 @@ static int alloc_irq_index(struct amd_io static int modify_irte_ga(struct amd_iommu *iommu, u16 devid, int index, struct irte_ga *irte, struct amd_ir_data *data) { - bool ret; struct irq_remap_table *table; - unsigned long flags; struct irte_ga *entry; + unsigned long flags; + u128 old; table = get_irq_table(iommu, devid); if (!table) @@ -3017,16 +3017,14 @@ static int modify_irte_ga(struct amd_iom entry = (struct irte_ga *)table->table; entry = &entry[index]; - ret = cmpxchg_double(&entry->lo.val, &entry->hi.val, - entry->lo.val, entry->hi.val, - irte->lo.val, irte->hi.val); /* * We use cmpxchg16 to atomically update the 128-bit IRTE, * and it cannot be updated by the hardware or other processors * behind us, so the return value of cmpxchg16 should be the * same as the old value. */ - WARN_ON(!ret); + old = entry->irte; + WARN_ON(!try_cmpxchg128(&entry->irte, &old, irte->irte)); if (data) data->ref = entry; From patchwork Wed May 31 13:08:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 13262299 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C2839C7EE3A for ; Wed, 31 May 2023 13:28:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236379AbjEaN2W (ORCPT ); Wed, 31 May 2023 09:28:22 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47434 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235287AbjEaN2U (ORCPT ); Wed, 31 May 2023 09:28:20 -0400 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6A86310E; Wed, 31 May 2023 06:28:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=oEM9cKMOhay2aIvdrV8l4iv3Y85XNhmkKjhGaIIXx9s=; b=HwjGAHT5MrE1xu4SBX93HkmX9u a2znWTmLZ07JD0sUzV+0IbVtXOD/6RBCRGFsNejvnCtTI/amSgZ5UZ4/YfCJGHaxijoudZ1oYAVKW uqofk9TNQ3evFFh/rxMp+rU18zbEUWrANyQO39RE82BwGYNqD3ZPhacASMa6BwWLixxGUTLxqd8Vi d1YqKT1tBIGCET2CoaFNyLZHd92jwsvnYwiD1nC7ltau7JQ4E8QzuEDIZNUQGqdqQXo+32DWDjxVT 5m52fmU5Y+Z3xNFs0Ud9NdY2ZSjk0KJZeTseHj/XGzXMWFdlSANtg+aZnACSKvKlZR9CBvnkLYmgn dJcqgM4g==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1q4LrG-00FTCd-1R; Wed, 31 May 2023 13:27:22 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 55A463002A9; Wed, 31 May 2023 15:27:21 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 9E8FE243B8572; Wed, 31 May 2023 15:27:16 +0200 (CEST) Message-ID: <20230531132323.855976804@infradead.org> User-Agent: quilt/0.66 Date: Wed, 31 May 2023 15:08:42 +0200 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , Herbert Xu , davem@davemloft.net, penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org, linux-crypto@vger.kernel.org, sfr@canb.auug.org.au, mpe@ellerman.id.au, James.Bottomley@hansenpartnership.com, deller@gmx.de, linux-parisc@vger.kernel.org Subject: [PATCH 09/12] x86,intel_iommu: Replace cmpxchg_double() References: <20230531130833.635651916@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-parisc@vger.kernel.org Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Lu Baolu Reviewed-by: Arnd Bergmann --- drivers/iommu/intel/irq_remapping.c | 8 -- include/linux/dmar.h | 125 +++++++++++++++++++----------------- 2 files changed, 68 insertions(+), 65 deletions(-) --- a/drivers/iommu/intel/irq_remapping.c +++ b/drivers/iommu/intel/irq_remapping.c @@ -175,18 +175,14 @@ static int modify_irte(struct irq_2_iomm irte = &iommu->ir_table->base[index]; if ((irte->pst == 1) || (irte_modified->pst == 1)) { - bool ret; - - ret = cmpxchg_double(&irte->low, &irte->high, - irte->low, irte->high, - irte_modified->low, irte_modified->high); /* * We use cmpxchg16 to atomically update the 128-bit IRTE, * and it cannot be updated by the hardware or other processors * behind us, so the return value of cmpxchg16 should be the * same as the old value. */ - WARN_ON(!ret); + u128 old = irte->irte; + WARN_ON(!try_cmpxchg128(&irte->irte, &old, irte_modified->irte)); } else { WRITE_ONCE(irte->low, irte_modified->low); WRITE_ONCE(irte->high, irte_modified->high); --- a/include/linux/dmar.h +++ b/include/linux/dmar.h @@ -202,67 +202,74 @@ static inline void detect_intel_iommu(vo struct irte { union { - /* Shared between remapped and posted mode*/ struct { - __u64 present : 1, /* 0 */ - fpd : 1, /* 1 */ - __res0 : 6, /* 2 - 6 */ - avail : 4, /* 8 - 11 */ - __res1 : 3, /* 12 - 14 */ - pst : 1, /* 15 */ - vector : 8, /* 16 - 23 */ - __res2 : 40; /* 24 - 63 */ + union { + /* Shared between remapped and posted mode*/ + struct { + __u64 present : 1, /* 0 */ + fpd : 1, /* 1 */ + __res0 : 6, /* 2 - 6 */ + avail : 4, /* 8 - 11 */ + __res1 : 3, /* 12 - 14 */ + pst : 1, /* 15 */ + vector : 8, /* 16 - 23 */ + __res2 : 40; /* 24 - 63 */ + }; + + /* Remapped mode */ + struct { + __u64 r_present : 1, /* 0 */ + r_fpd : 1, /* 1 */ + dst_mode : 1, /* 2 */ + redir_hint : 1, /* 3 */ + trigger_mode : 1, /* 4 */ + dlvry_mode : 3, /* 5 - 7 */ + r_avail : 4, /* 8 - 11 */ + r_res0 : 4, /* 12 - 15 */ + r_vector : 8, /* 16 - 23 */ + r_res1 : 8, /* 24 - 31 */ + dest_id : 32; /* 32 - 63 */ + }; + + /* Posted mode */ + struct { + __u64 p_present : 1, /* 0 */ + p_fpd : 1, /* 1 */ + p_res0 : 6, /* 2 - 7 */ + p_avail : 4, /* 8 - 11 */ + p_res1 : 2, /* 12 - 13 */ + p_urgent : 1, /* 14 */ + p_pst : 1, /* 15 */ + p_vector : 8, /* 16 - 23 */ + p_res2 : 14, /* 24 - 37 */ + pda_l : 26; /* 38 - 63 */ + }; + __u64 low; + }; + + union { + /* Shared between remapped and posted mode*/ + struct { + __u64 sid : 16, /* 64 - 79 */ + sq : 2, /* 80 - 81 */ + svt : 2, /* 82 - 83 */ + __res3 : 44; /* 84 - 127 */ + }; + + /* Posted mode*/ + struct { + __u64 p_sid : 16, /* 64 - 79 */ + p_sq : 2, /* 80 - 81 */ + p_svt : 2, /* 82 - 83 */ + p_res3 : 12, /* 84 - 95 */ + pda_h : 32; /* 96 - 127 */ + }; + __u64 high; + }; }; - - /* Remapped mode */ - struct { - __u64 r_present : 1, /* 0 */ - r_fpd : 1, /* 1 */ - dst_mode : 1, /* 2 */ - redir_hint : 1, /* 3 */ - trigger_mode : 1, /* 4 */ - dlvry_mode : 3, /* 5 - 7 */ - r_avail : 4, /* 8 - 11 */ - r_res0 : 4, /* 12 - 15 */ - r_vector : 8, /* 16 - 23 */ - r_res1 : 8, /* 24 - 31 */ - dest_id : 32; /* 32 - 63 */ - }; - - /* Posted mode */ - struct { - __u64 p_present : 1, /* 0 */ - p_fpd : 1, /* 1 */ - p_res0 : 6, /* 2 - 7 */ - p_avail : 4, /* 8 - 11 */ - p_res1 : 2, /* 12 - 13 */ - p_urgent : 1, /* 14 */ - p_pst : 1, /* 15 */ - p_vector : 8, /* 16 - 23 */ - p_res2 : 14, /* 24 - 37 */ - pda_l : 26; /* 38 - 63 */ - }; - __u64 low; - }; - - union { - /* Shared between remapped and posted mode*/ - struct { - __u64 sid : 16, /* 64 - 79 */ - sq : 2, /* 80 - 81 */ - svt : 2, /* 82 - 83 */ - __res3 : 44; /* 84 - 127 */ - }; - - /* Posted mode*/ - struct { - __u64 p_sid : 16, /* 64 - 79 */ - p_sq : 2, /* 80 - 81 */ - p_svt : 2, /* 82 - 83 */ - p_res3 : 12, /* 84 - 95 */ - pda_h : 32; /* 96 - 127 */ - }; - __u64 high; +#ifdef CONFIG_IRQ_REMAP + __u128 irte; +#endif }; }; From patchwork Wed May 31 13:08:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 13262308 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9CCA6C83003 for ; Wed, 31 May 2023 13:28:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236493AbjEaN2g (ORCPT ); Wed, 31 May 2023 09:28:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47446 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236359AbjEaN2W (ORCPT ); Wed, 31 May 2023 09:28:22 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5878B129; Wed, 31 May 2023 06:28:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=yL1qo/vb48QdPJDHPVkvS1Zq4+B/eXJGNt78k1EwTE8=; b=DW3puokgt2Sigksg/ERQDTbI5F X9p3E/IslASCuC3vUaf+7HAUpBufw+aylakDJDZNkblCqjUoUSTZWxlRcwf5yMkWCMD8tlgdafG0L 47hUgse3tDg8ztseLhxGtEY8y1bECyX246uQf/3WZj3YyP2oEpMKWBM1NEOVA8CFpM3l5FCrI6Cdz n9cfpBSSywRhj6cZ5sUK91yTNWQAOutrtz0Ob/tcDLU+XF2taTda7oKkgDE7N4oNlrp15ideTU32m zvLMvVJj1WXNQETpMpEIgBOCL8rD8p7/DMDwGYk2HLj02RY5wKHKFAWoMr7L1y3DrSkoE/x7f5cqR 2rDatRCA==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by casper.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat Linux)) id 1q4LrH-007IJX-4Q; Wed, 31 May 2023 13:27:23 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 5DBE1300DDC; Wed, 31 May 2023 15:27:21 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id A1643243B8571; Wed, 31 May 2023 15:27:16 +0200 (CEST) Message-ID: <20230531132323.924677086@infradead.org> User-Agent: quilt/0.66 Date: Wed, 31 May 2023 15:08:43 +0200 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , Herbert Xu , davem@davemloft.net, penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org, linux-crypto@vger.kernel.org, sfr@canb.auug.org.au, mpe@ellerman.id.au, James.Bottomley@hansenpartnership.com, deller@gmx.de, linux-parisc@vger.kernel.org Subject: [PATCH 10/12] slub: Replace cmpxchg_double() References: <20230531130833.635651916@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-parisc@vger.kernel.org Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Arnd Bergmann Acked-by: Vlastimil Babka Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> --- include/linux/slub_def.h | 12 +++- mm/slab.h | 53 +++++++++++++++-- mm/slub.c | 139 +++++++++++++++++++++++++++-------------------- 3 files changed, 137 insertions(+), 67 deletions(-) --- a/include/linux/slub_def.h +++ b/include/linux/slub_def.h @@ -39,7 +39,8 @@ enum stat_item { CPU_PARTIAL_FREE, /* Refill cpu partial on free */ CPU_PARTIAL_NODE, /* Refill cpu partial from node partial */ CPU_PARTIAL_DRAIN, /* Drain cpu partial to node partial */ - NR_SLUB_STAT_ITEMS }; + NR_SLUB_STAT_ITEMS +}; #ifndef CONFIG_SLUB_TINY /* @@ -47,8 +48,13 @@ enum stat_item { * with this_cpu_cmpxchg_double() alignment requirements. */ struct kmem_cache_cpu { - void **freelist; /* Pointer to next available object */ - unsigned long tid; /* Globally unique transaction id */ + union { + struct { + void **freelist; /* Pointer to next available object */ + unsigned long tid; /* Globally unique transaction id */ + }; + freelist_aba_t freelist_tid; + }; struct slab *slab; /* The slab from which we are allocating */ #ifdef CONFIG_SLUB_CPU_PARTIAL struct slab *partial; /* Partially allocated frozen slabs */ --- a/mm/slab.h +++ b/mm/slab.h @@ -6,6 +6,38 @@ */ void __init kmem_cache_init(void); +#ifdef CONFIG_64BIT +# ifdef system_has_cmpxchg128 +# define system_has_freelist_aba() system_has_cmpxchg128() +# define try_cmpxchg_freelist try_cmpxchg128 +# endif +#define this_cpu_try_cmpxchg_freelist this_cpu_try_cmpxchg128 +typedef u128 freelist_full_t; +#else /* CONFIG_64BIT */ +# ifdef system_has_cmpxchg64 +# define system_has_freelist_aba() system_has_cmpxchg64() +# define try_cmpxchg_freelist try_cmpxchg64 +# endif +#define this_cpu_try_cmpxchg_freelist this_cpu_try_cmpxchg64 +typedef u64 freelist_full_t; +#endif /* CONFIG_64BIT */ + +#if defined(system_has_freelist_aba) && !defined(CONFIG_HAVE_ALIGNED_STRUCT_PAGE) +#undef system_has_freelist_aba +#endif + +/* + * Freelist pointer and counter to cmpxchg together, avoids the typical ABA + * problems with cmpxchg of just a pointer. + */ +typedef union { + struct { + void *freelist; + unsigned long counter; + }; + freelist_full_t full; +} freelist_aba_t; + /* Reuses the bits in struct page */ struct slab { unsigned long __page_flags; @@ -38,14 +70,21 @@ struct slab { #endif }; /* Double-word boundary */ - void *freelist; /* first free object */ union { - unsigned long counters; struct { - unsigned inuse:16; - unsigned objects:15; - unsigned frozen:1; + void *freelist; /* first free object */ + union { + unsigned long counters; + struct { + unsigned inuse:16; + unsigned objects:15; + unsigned frozen:1; + }; + }; }; +#ifdef system_has_freelist_aba + freelist_aba_t freelist_counter; +#endif }; }; struct rcu_head rcu_head; @@ -72,8 +111,8 @@ SLAB_MATCH(memcg_data, memcg_data); #endif #undef SLAB_MATCH static_assert(sizeof(struct slab) <= sizeof(struct page)); -#if defined(CONFIG_HAVE_CMPXCHG_DOUBLE) && defined(CONFIG_SLUB) -static_assert(IS_ALIGNED(offsetof(struct slab, freelist), 2*sizeof(void *))); +#if defined(system_has_freelist_aba) && defined(CONFIG_SLUB) +static_assert(IS_ALIGNED(offsetof(struct slab, freelist), sizeof(freelist_aba_t))); #endif /** --- a/mm/slub.c +++ b/mm/slub.c @@ -292,7 +292,12 @@ static inline bool kmem_cache_has_cpu_pa /* Poison object */ #define __OBJECT_POISON ((slab_flags_t __force)0x80000000U) /* Use cmpxchg_double */ + +#ifdef system_has_freelist_aba #define __CMPXCHG_DOUBLE ((slab_flags_t __force)0x40000000U) +#else +#define __CMPXCHG_DOUBLE ((slab_flags_t __force)0U) +#endif /* * Tracking user of a slab. @@ -512,6 +517,40 @@ static __always_inline void slab_unlock( __bit_spin_unlock(PG_locked, &page->flags); } +static inline bool +__update_freelist_fast(struct slab *slab, + void *freelist_old, unsigned long counters_old, + void *freelist_new, unsigned long counters_new) +{ +#ifdef system_has_freelist_aba + freelist_aba_t old = { .freelist = freelist_old, .counter = counters_old }; + freelist_aba_t new = { .freelist = freelist_new, .counter = counters_new }; + + return try_cmpxchg_freelist(&slab->freelist_counter.full, &old.full, new.full); +#else + return false; +#endif +} + +static inline bool +__update_freelist_slow(struct slab *slab, + void *freelist_old, unsigned long counters_old, + void *freelist_new, unsigned long counters_new) +{ + bool ret = false; + + slab_lock(slab); + if (slab->freelist == freelist_old && + slab->counters == counters_old) { + slab->freelist = freelist_new; + slab->counters = counters_new; + ret = true; + } + slab_unlock(slab); + + return ret; +} + /* * Interrupts must be disabled (for the fallback code to work right), typically * by an _irqsave() lock variant. On PREEMPT_RT the preempt_disable(), which is @@ -519,33 +558,25 @@ static __always_inline void slab_unlock( * allocation/ free operation in hardirq context. Therefore nothing can * interrupt the operation. */ -static inline bool __cmpxchg_double_slab(struct kmem_cache *s, struct slab *slab, +static inline bool __slab_update_freelist(struct kmem_cache *s, struct slab *slab, void *freelist_old, unsigned long counters_old, void *freelist_new, unsigned long counters_new, const char *n) { + bool ret; + if (USE_LOCKLESS_FAST_PATH()) lockdep_assert_irqs_disabled(); -#if defined(CONFIG_HAVE_CMPXCHG_DOUBLE) && \ - defined(CONFIG_HAVE_ALIGNED_STRUCT_PAGE) + if (s->flags & __CMPXCHG_DOUBLE) { - if (cmpxchg_double(&slab->freelist, &slab->counters, - freelist_old, counters_old, - freelist_new, counters_new)) - return true; - } else -#endif - { - slab_lock(slab); - if (slab->freelist == freelist_old && - slab->counters == counters_old) { - slab->freelist = freelist_new; - slab->counters = counters_new; - slab_unlock(slab); - return true; - } - slab_unlock(slab); + ret = __update_freelist_fast(slab, freelist_old, counters_old, + freelist_new, counters_new); + } else { + ret = __update_freelist_slow(slab, freelist_old, counters_old, + freelist_new, counters_new); } + if (likely(ret)) + return true; cpu_relax(); stat(s, CMPXCHG_DOUBLE_FAIL); @@ -557,36 +588,26 @@ static inline bool __cmpxchg_double_slab return false; } -static inline bool cmpxchg_double_slab(struct kmem_cache *s, struct slab *slab, +static inline bool slab_update_freelist(struct kmem_cache *s, struct slab *slab, void *freelist_old, unsigned long counters_old, void *freelist_new, unsigned long counters_new, const char *n) { -#if defined(CONFIG_HAVE_CMPXCHG_DOUBLE) && \ - defined(CONFIG_HAVE_ALIGNED_STRUCT_PAGE) + bool ret; + if (s->flags & __CMPXCHG_DOUBLE) { - if (cmpxchg_double(&slab->freelist, &slab->counters, - freelist_old, counters_old, - freelist_new, counters_new)) - return true; - } else -#endif - { + ret = __update_freelist_fast(slab, freelist_old, counters_old, + freelist_new, counters_new); + } else { unsigned long flags; local_irq_save(flags); - slab_lock(slab); - if (slab->freelist == freelist_old && - slab->counters == counters_old) { - slab->freelist = freelist_new; - slab->counters = counters_new; - slab_unlock(slab); - local_irq_restore(flags); - return true; - } - slab_unlock(slab); + ret = __update_freelist_slow(slab, freelist_old, counters_old, + freelist_new, counters_new); local_irq_restore(flags); } + if (likely(ret)) + return true; cpu_relax(); stat(s, CMPXCHG_DOUBLE_FAIL); @@ -2228,7 +2249,7 @@ static inline void *acquire_slab(struct VM_BUG_ON(new.frozen); new.frozen = 1; - if (!__cmpxchg_double_slab(s, slab, + if (!__slab_update_freelist(s, slab, freelist, counters, new.freelist, new.counters, "acquire_slab")) @@ -2554,7 +2575,7 @@ static void deactivate_slab(struct kmem_ } - if (!cmpxchg_double_slab(s, slab, + if (!slab_update_freelist(s, slab, old.freelist, old.counters, new.freelist, new.counters, "unfreezing slab")) { @@ -2611,7 +2632,7 @@ static void __unfreeze_partials(struct k new.frozen = 0; - } while (!__cmpxchg_double_slab(s, slab, + } while (!__slab_update_freelist(s, slab, old.freelist, old.counters, new.freelist, new.counters, "unfreezing slab")); @@ -3008,6 +3029,18 @@ static inline bool pfmemalloc_match(stru } #ifndef CONFIG_SLUB_TINY +static inline bool +__update_cpu_freelist_fast(struct kmem_cache *s, + void *freelist_old, void *freelist_new, + unsigned long tid) +{ + freelist_aba_t old = { .freelist = freelist_old, .counter = tid }; + freelist_aba_t new = { .freelist = freelist_new, .counter = next_tid(tid) }; + + return this_cpu_try_cmpxchg_freelist(s->cpu_slab->freelist_tid.full, + &old.full, new.full); +} + /* * Check the slab->freelist and either transfer the freelist to the * per cpu freelist or deactivate the slab. @@ -3034,7 +3067,7 @@ static inline void *get_freelist(struct new.inuse = slab->objects; new.frozen = freelist != NULL; - } while (!__cmpxchg_double_slab(s, slab, + } while (!__slab_update_freelist(s, slab, freelist, counters, NULL, new.counters, "get_freelist")); @@ -3359,11 +3392,7 @@ static __always_inline void *__slab_allo * against code executing on this cpu *not* from access by * other cpus. */ - if (unlikely(!this_cpu_cmpxchg_double( - s->cpu_slab->freelist, s->cpu_slab->tid, - object, tid, - next_object, next_tid(tid)))) { - + if (unlikely(!__update_cpu_freelist_fast(s, object, next_object, tid))) { note_cmpxchg_failure("slab_alloc", s, tid); goto redo; } @@ -3631,7 +3660,7 @@ static void __slab_free(struct kmem_cach } } - } while (!cmpxchg_double_slab(s, slab, + } while (!slab_update_freelist(s, slab, prior, counters, head, new.counters, "__slab_free")); @@ -3736,11 +3765,7 @@ static __always_inline void do_slab_free set_freepointer(s, tail_obj, freelist); - if (unlikely(!this_cpu_cmpxchg_double( - s->cpu_slab->freelist, s->cpu_slab->tid, - freelist, tid, - head, next_tid(tid)))) { - + if (unlikely(!__update_cpu_freelist_fast(s, freelist, head, tid))) { note_cmpxchg_failure("slab_free", s, tid); goto redo; } @@ -4505,11 +4530,11 @@ static int kmem_cache_open(struct kmem_c } } -#if defined(CONFIG_HAVE_CMPXCHG_DOUBLE) && \ - defined(CONFIG_HAVE_ALIGNED_STRUCT_PAGE) - if (system_has_cmpxchg_double() && (s->flags & SLAB_NO_CMPXCHG) == 0) +#ifdef system_has_freelist_aba + if (system_has_freelist_aba() && !(s->flags & SLAB_NO_CMPXCHG)) { /* Enable fast mode */ s->flags |= __CMPXCHG_DOUBLE; + } #endif /* From patchwork Wed May 31 13:08:44 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 13262309 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3B9D0C7EE2E for ; Wed, 31 May 2023 13:28:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236500AbjEaN2i (ORCPT ); Wed, 31 May 2023 09:28:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47458 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236356AbjEaN2V (ORCPT ); Wed, 31 May 2023 09:28:21 -0400 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 846E2124; Wed, 31 May 2023 06:28:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=L7QX61+pLJRtQvuTIyw+Vs54hrEG/gXHVHa+cgAsd2c=; b=U2eaYIdjGP5k0E56EzbAHYaewU D9nGB9W5y3Zs127NBWHZo1VXPlkq4AXnCr08eXmmd1qszL1M1MOo6GTR88wX9B3kaxticQVTQEB22 XQf88+vUb+SkqmmbJzFzkShkem2AH9Y4p952TVOVwJKkjtkTFJJAW29YWI31yfNLV2pER2WsXb6l5 IOkvK02gTAYNzFKCrRnu64kq3KEQbYFJcUomHmAGT++zvfzG8eQhuNbr4uZQ9/OMdE1lDeVHxkVVt 0QRIpnDeqMtJy7D7bwdOoIVf11NvY3i0zqsblzwhZemw8pIsRHNjp1YT9OxVbFKg8P9BSFLq8Q4hJ rk265QJg==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1q4LrH-00FTCg-0T; Wed, 31 May 2023 13:27:23 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 6282B300E25; Wed, 31 May 2023 15:27:21 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id A87B3243B857A; Wed, 31 May 2023 15:27:16 +0200 (CEST) Message-ID: <20230531132323.991907085@infradead.org> User-Agent: quilt/0.66 Date: Wed, 31 May 2023 15:08:44 +0200 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , Herbert Xu , davem@davemloft.net, penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org, linux-crypto@vger.kernel.org, sfr@canb.auug.org.au, mpe@ellerman.id.au, James.Bottomley@hansenpartnership.com, deller@gmx.de, linux-parisc@vger.kernel.org Subject: [PATCH 11/12] arch: Remove cmpxchg_double References: <20230531130833.635651916@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-parisc@vger.kernel.org No moar users, remove the monster. Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Arnd Bergmann Acked-by: Heiko Carstens --- Documentation/core-api/this_cpu_ops.rst | 2 - arch/arm64/include/asm/atomic_ll_sc.h | 33 ---------------- arch/arm64/include/asm/atomic_lse.h | 36 ------------------ arch/arm64/include/asm/cmpxchg.h | 46 ----------------------- arch/arm64/include/asm/percpu.h | 10 ----- arch/s390/include/asm/cmpxchg.h | 34 ----------------- arch/s390/include/asm/percpu.h | 18 --------- arch/x86/include/asm/cmpxchg.h | 25 ------------ arch/x86/include/asm/cmpxchg_32.h | 1 arch/x86/include/asm/cmpxchg_64.h | 1 arch/x86/include/asm/percpu.h | 42 --------------------- include/asm-generic/percpu.h | 58 ----------------------------- include/linux/atomic/atomic-instrumented.h | 17 -------- include/linux/percpu-defs.h | 38 ------------------- scripts/atomic/gen-atomic-instrumented.sh | 15 ++----- 15 files changed, 5 insertions(+), 371 deletions(-) --- a/Documentation/core-api/this_cpu_ops.rst +++ b/Documentation/core-api/this_cpu_ops.rst @@ -53,7 +53,6 @@ are defined. These operations can be use this_cpu_add_return(pcp, val) this_cpu_xchg(pcp, nval) this_cpu_cmpxchg(pcp, oval, nval) - this_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) this_cpu_sub(pcp, val) this_cpu_inc(pcp) this_cpu_dec(pcp) @@ -242,7 +241,6 @@ modifies the variable, then RMW actions __this_cpu_add_return(pcp, val) __this_cpu_xchg(pcp, nval) __this_cpu_cmpxchg(pcp, oval, nval) - __this_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) __this_cpu_sub(pcp, val) __this_cpu_inc(pcp) __this_cpu_dec(pcp) --- a/arch/arm64/include/asm/atomic_ll_sc.h +++ b/arch/arm64/include/asm/atomic_ll_sc.h @@ -294,39 +294,6 @@ __CMPXCHG_CASE( , , mb_, 64, dmb ish, #undef __CMPXCHG_CASE -#define __CMPXCHG_DBL(name, mb, rel, cl) \ -static __always_inline long \ -__ll_sc__cmpxchg_double##name(unsigned long old1, \ - unsigned long old2, \ - unsigned long new1, \ - unsigned long new2, \ - volatile void *ptr) \ -{ \ - unsigned long tmp, ret; \ - \ - asm volatile("// __cmpxchg_double" #name "\n" \ - " prfm pstl1strm, %2\n" \ - "1: ldxp %0, %1, %2\n" \ - " eor %0, %0, %3\n" \ - " eor %1, %1, %4\n" \ - " orr %1, %0, %1\n" \ - " cbnz %1, 2f\n" \ - " st" #rel "xp %w0, %5, %6, %2\n" \ - " cbnz %w0, 1b\n" \ - " " #mb "\n" \ - "2:" \ - : "=&r" (tmp), "=&r" (ret), "+Q" (*(__uint128_t *)ptr) \ - : "r" (old1), "r" (old2), "r" (new1), "r" (new2) \ - : cl); \ - \ - return ret; \ -} - -__CMPXCHG_DBL( , , , ) -__CMPXCHG_DBL(_mb, dmb ish, l, "memory") - -#undef __CMPXCHG_DBL - union __u128_halves { u128 full; struct { --- a/arch/arm64/include/asm/atomic_lse.h +++ b/arch/arm64/include/asm/atomic_lse.h @@ -281,42 +281,6 @@ __CMPXCHG_CASE(x, , mb_, 64, al, "memo #undef __CMPXCHG_CASE -#define __CMPXCHG_DBL(name, mb, cl...) \ -static __always_inline long \ -__lse__cmpxchg_double##name(unsigned long old1, \ - unsigned long old2, \ - unsigned long new1, \ - unsigned long new2, \ - volatile void *ptr) \ -{ \ - unsigned long oldval1 = old1; \ - unsigned long oldval2 = old2; \ - register unsigned long x0 asm ("x0") = old1; \ - register unsigned long x1 asm ("x1") = old2; \ - register unsigned long x2 asm ("x2") = new1; \ - register unsigned long x3 asm ("x3") = new2; \ - register unsigned long x4 asm ("x4") = (unsigned long)ptr; \ - \ - asm volatile( \ - __LSE_PREAMBLE \ - " casp" #mb "\t%[old1], %[old2], %[new1], %[new2], %[v]\n"\ - " eor %[old1], %[old1], %[oldval1]\n" \ - " eor %[old2], %[old2], %[oldval2]\n" \ - " orr %[old1], %[old1], %[old2]" \ - : [old1] "+&r" (x0), [old2] "+&r" (x1), \ - [v] "+Q" (*(__uint128_t *)ptr) \ - : [new1] "r" (x2), [new2] "r" (x3), [ptr] "r" (x4), \ - [oldval1] "r" (oldval1), [oldval2] "r" (oldval2) \ - : cl); \ - \ - return x0; \ -} - -__CMPXCHG_DBL( , ) -__CMPXCHG_DBL(_mb, al, "memory") - -#undef __CMPXCHG_DBL - #define __CMPXCHG128(name, mb, cl...) \ static __always_inline u128 \ __lse__cmpxchg128##name(volatile u128 *ptr, u128 old, u128 new) \ --- a/arch/arm64/include/asm/cmpxchg.h +++ b/arch/arm64/include/asm/cmpxchg.h @@ -130,22 +130,6 @@ __CMPXCHG_CASE(mb_, 64) #undef __CMPXCHG_CASE -#define __CMPXCHG_DBL(name) \ -static inline long __cmpxchg_double##name(unsigned long old1, \ - unsigned long old2, \ - unsigned long new1, \ - unsigned long new2, \ - volatile void *ptr) \ -{ \ - return __lse_ll_sc_body(_cmpxchg_double##name, \ - old1, old2, new1, new2, ptr); \ -} - -__CMPXCHG_DBL( ) -__CMPXCHG_DBL(_mb) - -#undef __CMPXCHG_DBL - #define __CMPXCHG128(name) \ static inline u128 __cmpxchg128##name(volatile u128 *ptr, \ u128 old, u128 new) \ @@ -211,36 +195,6 @@ __CMPXCHG_GEN(_mb) #define arch_cmpxchg64 arch_cmpxchg #define arch_cmpxchg64_local arch_cmpxchg_local -/* cmpxchg_double */ -#define system_has_cmpxchg_double() 1 - -#define __cmpxchg_double_check(ptr1, ptr2) \ -({ \ - if (sizeof(*(ptr1)) != 8) \ - BUILD_BUG(); \ - VM_BUG_ON((unsigned long *)(ptr2) - (unsigned long *)(ptr1) != 1); \ -}) - -#define arch_cmpxchg_double(ptr1, ptr2, o1, o2, n1, n2) \ -({ \ - int __ret; \ - __cmpxchg_double_check(ptr1, ptr2); \ - __ret = !__cmpxchg_double_mb((unsigned long)(o1), (unsigned long)(o2), \ - (unsigned long)(n1), (unsigned long)(n2), \ - ptr1); \ - __ret; \ -}) - -#define arch_cmpxchg_double_local(ptr1, ptr2, o1, o2, n1, n2) \ -({ \ - int __ret; \ - __cmpxchg_double_check(ptr1, ptr2); \ - __ret = !__cmpxchg_double((unsigned long)(o1), (unsigned long)(o2), \ - (unsigned long)(n1), (unsigned long)(n2), \ - ptr1); \ - __ret; \ -}) - /* cmpxchg128 */ #define system_has_cmpxchg128() 1 --- a/arch/arm64/include/asm/percpu.h +++ b/arch/arm64/include/asm/percpu.h @@ -145,16 +145,6 @@ PERCPU_RET_OP(add, add, ldadd) * preemption point when TIF_NEED_RESCHED gets set while preemption is * disabled. */ -#define this_cpu_cmpxchg_double_8(ptr1, ptr2, o1, o2, n1, n2) \ -({ \ - int __ret; \ - preempt_disable_notrace(); \ - __ret = cmpxchg_double_local( raw_cpu_ptr(&(ptr1)), \ - raw_cpu_ptr(&(ptr2)), \ - o1, o2, n1, n2); \ - preempt_enable_notrace(); \ - __ret; \ -}) #define _pcp_protect(op, pcp, ...) \ ({ \ --- a/arch/s390/include/asm/cmpxchg.h +++ b/arch/s390/include/asm/cmpxchg.h @@ -190,40 +190,6 @@ static __always_inline unsigned long __c #define arch_cmpxchg_local arch_cmpxchg #define arch_cmpxchg64_local arch_cmpxchg -#define system_has_cmpxchg_double() 1 - -static __always_inline int __cmpxchg_double(unsigned long p1, unsigned long p2, - unsigned long o1, unsigned long o2, - unsigned long n1, unsigned long n2) -{ - union register_pair old = { .even = o1, .odd = o2, }; - union register_pair new = { .even = n1, .odd = n2, }; - int cc; - - asm volatile( - " cdsg %[old],%[new],%[ptr]\n" - " ipm %[cc]\n" - " srl %[cc],28\n" - : [cc] "=&d" (cc), [old] "+&d" (old.pair) - : [new] "d" (new.pair), - [ptr] "QS" (*(unsigned long *)p1), "Q" (*(unsigned long *)p2) - : "memory", "cc"); - return !cc; -} - -#define arch_cmpxchg_double(p1, p2, o1, o2, n1, n2) \ -({ \ - typeof(p1) __p1 = (p1); \ - typeof(p2) __p2 = (p2); \ - \ - BUILD_BUG_ON(sizeof(*(p1)) != sizeof(long)); \ - BUILD_BUG_ON(sizeof(*(p2)) != sizeof(long)); \ - VM_BUG_ON((unsigned long)((__p1) + 1) != (unsigned long)(__p2));\ - __cmpxchg_double((unsigned long)__p1, (unsigned long)__p2, \ - (unsigned long)(o1), (unsigned long)(o2), \ - (unsigned long)(n1), (unsigned long)(n2)); \ -}) - #define system_has_cmpxchg128() 1 static __always_inline u128 arch_cmpxchg128(volatile u128 *ptr, u128 old, u128 new) --- a/arch/s390/include/asm/percpu.h +++ b/arch/s390/include/asm/percpu.h @@ -180,24 +180,6 @@ #define this_cpu_xchg_4(pcp, nval) arch_this_cpu_xchg(pcp, nval) #define this_cpu_xchg_8(pcp, nval) arch_this_cpu_xchg(pcp, nval) -#define arch_this_cpu_cmpxchg_double(pcp1, pcp2, o1, o2, n1, n2) \ -({ \ - typeof(pcp1) *p1__; \ - typeof(pcp2) *p2__; \ - int ret__; \ - \ - preempt_disable_notrace(); \ - p1__ = raw_cpu_ptr(&(pcp1)); \ - p2__ = raw_cpu_ptr(&(pcp2)); \ - ret__ = __cmpxchg_double((unsigned long)p1__, (unsigned long)p2__, \ - (unsigned long)(o1), (unsigned long)(o2), \ - (unsigned long)(n1), (unsigned long)(n2)); \ - preempt_enable_notrace(); \ - ret__; \ -}) - -#define this_cpu_cmpxchg_double_8 arch_this_cpu_cmpxchg_double - #include #endif /* __ARCH_S390_PERCPU__ */ --- a/arch/x86/include/asm/cmpxchg.h +++ b/arch/x86/include/asm/cmpxchg.h @@ -239,29 +239,4 @@ extern void __add_wrong_size(void) #define __xadd(ptr, inc, lock) __xchg_op((ptr), (inc), xadd, lock) #define xadd(ptr, inc) __xadd((ptr), (inc), LOCK_PREFIX) -#define __cmpxchg_double(pfx, p1, p2, o1, o2, n1, n2) \ -({ \ - bool __ret; \ - __typeof__(*(p1)) __old1 = (o1), __new1 = (n1); \ - __typeof__(*(p2)) __old2 = (o2), __new2 = (n2); \ - BUILD_BUG_ON(sizeof(*(p1)) != sizeof(long)); \ - BUILD_BUG_ON(sizeof(*(p2)) != sizeof(long)); \ - VM_BUG_ON((unsigned long)(p1) % (2 * sizeof(long))); \ - VM_BUG_ON((unsigned long)((p1) + 1) != (unsigned long)(p2)); \ - asm volatile(pfx "cmpxchg%c5b %1" \ - CC_SET(e) \ - : CC_OUT(e) (__ret), \ - "+m" (*(p1)), "+m" (*(p2)), \ - "+a" (__old1), "+d" (__old2) \ - : "i" (2 * sizeof(long)), \ - "b" (__new1), "c" (__new2)); \ - __ret; \ -}) - -#define arch_cmpxchg_double(p1, p2, o1, o2, n1, n2) \ - __cmpxchg_double(LOCK_PREFIX, p1, p2, o1, o2, n1, n2) - -#define arch_cmpxchg_double_local(p1, p2, o1, o2, n1, n2) \ - __cmpxchg_double(, p1, p2, o1, o2, n1, n2) - #endif /* ASM_X86_CMPXCHG_H */ --- a/arch/x86/include/asm/cmpxchg_32.h +++ b/arch/x86/include/asm/cmpxchg_32.h @@ -103,7 +103,6 @@ static inline bool __try_cmpxchg64(volat #endif -#define system_has_cmpxchg_double() boot_cpu_has(X86_FEATURE_CX8) #define system_has_cmpxchg64() boot_cpu_has(X86_FEATURE_CX8) #endif /* _ASM_X86_CMPXCHG_32_H */ --- a/arch/x86/include/asm/cmpxchg_64.h +++ b/arch/x86/include/asm/cmpxchg_64.h @@ -81,7 +81,6 @@ static __always_inline bool arch_try_cmp return __arch_try_cmpxchg128(ptr, oldp, new,); } -#define system_has_cmpxchg_double() boot_cpu_has(X86_FEATURE_CX16) #define system_has_cmpxchg128() boot_cpu_has(X86_FEATURE_CX16) #endif /* _ASM_X86_CMPXCHG_64_H */ --- a/arch/x86/include/asm/percpu.h +++ b/arch/x86/include/asm/percpu.h @@ -351,23 +351,6 @@ do { \ #define this_cpu_cmpxchg_2(pcp, oval, nval) percpu_cmpxchg_op(2, volatile, pcp, oval, nval) #define this_cpu_cmpxchg_4(pcp, oval, nval) percpu_cmpxchg_op(4, volatile, pcp, oval, nval) -#ifdef CONFIG_X86_CMPXCHG64 -#define percpu_cmpxchg8b_double(pcp1, pcp2, o1, o2, n1, n2) \ -({ \ - bool __ret; \ - typeof(pcp1) __o1 = (o1), __n1 = (n1); \ - typeof(pcp2) __o2 = (o2), __n2 = (n2); \ - asm volatile("cmpxchg8b "__percpu_arg(1) \ - CC_SET(z) \ - : CC_OUT(z) (__ret), "+m" (pcp1), "+m" (pcp2), "+a" (__o1), "+d" (__o2) \ - : "b" (__n1), "c" (__n2)); \ - __ret; \ -}) - -#define raw_cpu_cmpxchg_double_4 percpu_cmpxchg8b_double -#define this_cpu_cmpxchg_double_4 percpu_cmpxchg8b_double -#endif /* CONFIG_X86_CMPXCHG64 */ - /* * Per cpu atomic 64 bit operations are only available under 64 bit. * 32 bit must fall back to generic operations. @@ -390,31 +373,6 @@ do { \ #define this_cpu_add_return_8(pcp, val) percpu_add_return_op(8, volatile, pcp, val) #define this_cpu_xchg_8(pcp, nval) percpu_xchg_op(8, volatile, pcp, nval) #define this_cpu_cmpxchg_8(pcp, oval, nval) percpu_cmpxchg_op(8, volatile, pcp, oval, nval) - -/* - * Pretty complex macro to generate cmpxchg16 instruction. The instruction - * is not supported on early AMD64 processors so we must be able to emulate - * it in software. The address used in the cmpxchg16 instruction must be - * aligned to a 16 byte boundary. - */ -#define percpu_cmpxchg16b_double(pcp1, pcp2, o1, o2, n1, n2) \ -({ \ - bool __ret; \ - typeof(pcp1) __o1 = (o1), __n1 = (n1); \ - typeof(pcp2) __o2 = (o2), __n2 = (n2); \ - asm volatile (ALTERNATIVE("leaq %P1, %%rsi; call this_cpu_cmpxchg16b_emu", \ - "cmpxchg16b " __percpu_arg(1), X86_FEATURE_CX16) \ - "setz %0" \ - : "=a" (__ret), "+m" (pcp1) \ - : "b" (__n1), "c" (__n2), \ - "a" (__o1), "d" (__o2) \ - : "memory", "rsi"); \ - __ret; \ -}) - -#define raw_cpu_cmpxchg_double_8 percpu_cmpxchg16b_double -#define this_cpu_cmpxchg_double_8 percpu_cmpxchg16b_double - #endif static __always_inline bool x86_this_cpu_constant_test_bit(unsigned int nr, --- a/include/asm-generic/percpu.h +++ b/include/asm-generic/percpu.h @@ -120,19 +120,6 @@ do { \ __old; \ }) -#define raw_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) \ -({ \ - typeof(pcp1) *__p1 = raw_cpu_ptr(&(pcp1)); \ - typeof(pcp2) *__p2 = raw_cpu_ptr(&(pcp2)); \ - int __ret = 0; \ - if (*__p1 == (oval1) && *__p2 == (oval2)) { \ - *__p1 = nval1; \ - *__p2 = nval2; \ - __ret = 1; \ - } \ - (__ret); \ -}) - #define __this_cpu_generic_read_nopreempt(pcp) \ ({ \ typeof(pcp) ___ret; \ @@ -211,17 +198,6 @@ do { \ __ret; \ }) -#define this_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) \ -({ \ - int __ret; \ - unsigned long __flags; \ - raw_local_irq_save(__flags); \ - __ret = raw_cpu_generic_cmpxchg_double(pcp1, pcp2, \ - oval1, oval2, nval1, nval2); \ - raw_local_irq_restore(__flags); \ - __ret; \ -}) - #ifndef raw_cpu_read_1 #define raw_cpu_read_1(pcp) raw_cpu_generic_read(pcp) #endif @@ -424,23 +400,6 @@ do { \ raw_cpu_generic_cmpxchg(pcp, oval, nval) #endif -#ifndef raw_cpu_cmpxchg_double_1 -#define raw_cpu_cmpxchg_double_1(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - raw_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) -#endif -#ifndef raw_cpu_cmpxchg_double_2 -#define raw_cpu_cmpxchg_double_2(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - raw_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) -#endif -#ifndef raw_cpu_cmpxchg_double_4 -#define raw_cpu_cmpxchg_double_4(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - raw_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) -#endif -#ifndef raw_cpu_cmpxchg_double_8 -#define raw_cpu_cmpxchg_double_8(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - raw_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) -#endif - #ifndef this_cpu_read_1 #define this_cpu_read_1(pcp) this_cpu_generic_read(pcp) #endif @@ -641,21 +600,4 @@ do { \ this_cpu_generic_cmpxchg(pcp, oval, nval) #endif -#ifndef this_cpu_cmpxchg_double_1 -#define this_cpu_cmpxchg_double_1(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - this_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) -#endif -#ifndef this_cpu_cmpxchg_double_2 -#define this_cpu_cmpxchg_double_2(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - this_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) -#endif -#ifndef this_cpu_cmpxchg_double_4 -#define this_cpu_cmpxchg_double_4(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - this_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) -#endif -#ifndef this_cpu_cmpxchg_double_8 -#define this_cpu_cmpxchg_double_8(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - this_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) -#endif - #endif /* _ASM_GENERIC_PERCPU_H_ */ --- a/include/linux/atomic/atomic-instrumented.h +++ b/include/linux/atomic/atomic-instrumented.h @@ -2234,21 +2234,6 @@ atomic_long_dec_if_positive(atomic_long_ arch_try_cmpxchg128_local(__ai_ptr, __ai_oldp, __VA_ARGS__); \ }) -#define cmpxchg_double(ptr, ...) \ -({ \ - typeof(ptr) __ai_ptr = (ptr); \ - kcsan_mb(); \ - instrument_atomic_read_write(__ai_ptr, 2 * sizeof(*__ai_ptr)); \ - arch_cmpxchg_double(__ai_ptr, __VA_ARGS__); \ -}) - - -#define cmpxchg_double_local(ptr, ...) \ -({ \ - typeof(ptr) __ai_ptr = (ptr); \ - instrument_atomic_read_write(__ai_ptr, 2 * sizeof(*__ai_ptr)); \ - arch_cmpxchg_double_local(__ai_ptr, __VA_ARGS__); \ -}) #endif /* _LINUX_ATOMIC_INSTRUMENTED_H */ -// 82d1be694fab30414527d0877c29fa75ed5a0b74 +// 3611991b015450e119bcd7417a9431af7f3ba13c --- a/include/linux/percpu-defs.h +++ b/include/linux/percpu-defs.h @@ -358,33 +358,6 @@ static __always_inline void __this_cpu_p pscr2_ret__; \ }) -/* - * Special handling for cmpxchg_double. cmpxchg_double is passed two - * percpu variables. The first has to be aligned to a double word - * boundary and the second has to follow directly thereafter. - * We enforce this on all architectures even if they don't support - * a double cmpxchg instruction, since it's a cheap requirement, and it - * avoids breaking the requirement for architectures with the instruction. - */ -#define __pcpu_double_call_return_bool(stem, pcp1, pcp2, ...) \ -({ \ - bool pdcrb_ret__; \ - __verify_pcpu_ptr(&(pcp1)); \ - BUILD_BUG_ON(sizeof(pcp1) != sizeof(pcp2)); \ - VM_BUG_ON((unsigned long)(&(pcp1)) % (2 * sizeof(pcp1))); \ - VM_BUG_ON((unsigned long)(&(pcp2)) != \ - (unsigned long)(&(pcp1)) + sizeof(pcp1)); \ - switch(sizeof(pcp1)) { \ - case 1: pdcrb_ret__ = stem##1(pcp1, pcp2, __VA_ARGS__); break; \ - case 2: pdcrb_ret__ = stem##2(pcp1, pcp2, __VA_ARGS__); break; \ - case 4: pdcrb_ret__ = stem##4(pcp1, pcp2, __VA_ARGS__); break; \ - case 8: pdcrb_ret__ = stem##8(pcp1, pcp2, __VA_ARGS__); break; \ - default: \ - __bad_size_call_parameter(); break; \ - } \ - pdcrb_ret__; \ -}) - #define __pcpu_size_call(stem, variable, ...) \ do { \ __verify_pcpu_ptr(&(variable)); \ @@ -443,9 +416,6 @@ do { \ __pcpu_size_call_return2(raw_cpu_cmpxchg_, pcp, oval, nval) #define raw_cpu_try_cmpxchg(pcp, ovalp, nval) \ __pcpu_size_call_return2bool(raw_cpu_try_cmpxchg_, pcp, ovalp, nval) -#define raw_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - __pcpu_double_call_return_bool(raw_cpu_cmpxchg_double_, pcp1, pcp2, oval1, oval2, nval1, nval2) - #define raw_cpu_sub(pcp, val) raw_cpu_add(pcp, -(val)) #define raw_cpu_inc(pcp) raw_cpu_add(pcp, 1) #define raw_cpu_dec(pcp) raw_cpu_sub(pcp, 1) @@ -505,11 +475,6 @@ do { \ raw_cpu_cmpxchg(pcp, oval, nval); \ }) -#define __this_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) \ -({ __this_cpu_preempt_check("cmpxchg_double"); \ - raw_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2); \ -}) - #define __this_cpu_sub(pcp, val) __this_cpu_add(pcp, -(typeof(pcp))(val)) #define __this_cpu_inc(pcp) __this_cpu_add(pcp, 1) #define __this_cpu_dec(pcp) __this_cpu_sub(pcp, 1) @@ -532,9 +497,6 @@ do { \ __pcpu_size_call_return2(this_cpu_cmpxchg_, pcp, oval, nval) #define this_cpu_try_cmpxchg(pcp, ovalp, nval) \ __pcpu_size_call_return2bool(this_cpu_try_cmpxchg_, pcp, ovalp, nval) -#define this_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - __pcpu_double_call_return_bool(this_cpu_cmpxchg_double_, pcp1, pcp2, oval1, oval2, nval1, nval2) - #define this_cpu_sub(pcp, val) this_cpu_add(pcp, -(typeof(pcp))(val)) #define this_cpu_inc(pcp) this_cpu_add(pcp, 1) #define this_cpu_dec(pcp) this_cpu_sub(pcp, 1) --- a/scripts/atomic/gen-atomic-instrumented.sh +++ b/scripts/atomic/gen-atomic-instrumented.sh @@ -84,7 +84,6 @@ gen_xchg() { local xchg="$1"; shift local order="$1"; shift - local mult="$1"; shift kcsan_barrier="" if [ "${xchg%_local}" = "${xchg}" ]; then @@ -104,8 +103,8 @@ cat < X-Patchwork-Id: 13262302 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D455EC7EE39 for ; Wed, 31 May 2023 13:28:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236422AbjEaN2Z (ORCPT ); Wed, 31 May 2023 09:28:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47450 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236314AbjEaN2V (ORCPT ); Wed, 31 May 2023 09:28:21 -0400 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5DA6812B; Wed, 31 May 2023 06:28:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=1cvnhq64QQSvdQ/ThzyQEhZhPISdvBfB1iYqdfSw3+U=; b=dEAiPyJ7c8GWvTXHawa9FYMa5y RcTU2/dDPs5cY4YnB6+unHRRRJkJTPocT+l9fVBL1T5GVSYwKO0qeN8JqCcETx/Inr4/AN1sfHxRe b1b7BfbYcIcDYgL13wupMtVXdr7AJtNnY68GfgKL3U7VxymWSM1OiXmYlu2qvpGzKmGMenGmWnE6R C89aJ+e5ksUXHGUNZjP1BUpQNpIOJc4I4d3iDnPyyPZC/izy9S6ksKb+nDOSLa1BGtKCDchQIM0ta uoR3tky7q07ya5MwRnFIgl2eUqTdR+xf/+DHkFrE2NSPNjZStwAPAedEsWQ/7XfmOpjc+hjtQrfuA UncWUN3Q==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1q4LrH-00FTCf-0R; Wed, 31 May 2023 13:27:23 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 63214300E3F; Wed, 31 May 2023 15:27:21 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id AB266243B8573; Wed, 31 May 2023 15:27:16 +0200 (CEST) Message-ID: <20230531132324.058821078@infradead.org> User-Agent: quilt/0.66 Date: Wed, 31 May 2023 15:08:45 +0200 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , Herbert Xu , davem@davemloft.net, penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org, linux-crypto@vger.kernel.org, sfr@canb.auug.org.au, mpe@ellerman.id.au, James.Bottomley@hansenpartnership.com, deller@gmx.de, linux-parisc@vger.kernel.org Subject: [PATCH 12/12] s390/cpum_sf: Convert to cmpxchg128() References: <20230531130833.635651916@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-parisc@vger.kernel.org Now that there is a cross arch u128 and cmpxchg128(), use those instead of the custom CDSG helper. Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Arnd Bergmann Acked-by: Heiko Carstens --- arch/s390/include/asm/cpu_mf.h | 2 +- arch/s390/kernel/perf_cpum_sf.c | 16 +++------------- 2 files changed, 4 insertions(+), 14 deletions(-) --- a/arch/s390/include/asm/cpu_mf.h +++ b/arch/s390/include/asm/cpu_mf.h @@ -140,7 +140,7 @@ union hws_trailer_header { unsigned int dsdes:16; /* 48-63: size of diagnostic SDE */ unsigned long long overflow; /* 64 - Overflow Count */ }; - __uint128_t val; + u128 val; }; struct hws_trailer_entry { --- a/arch/s390/kernel/perf_cpum_sf.c +++ b/arch/s390/kernel/perf_cpum_sf.c @@ -1271,16 +1271,6 @@ static void hw_collect_samples(struct pe } } -static inline __uint128_t __cdsg(__uint128_t *ptr, __uint128_t old, __uint128_t new) -{ - asm volatile( - " cdsg %[old],%[new],%[ptr]\n" - : [old] "+d" (old), [ptr] "+QS" (*ptr) - : [new] "d" (new) - : "memory", "cc"); - return old; -} - /* hw_perf_event_update() - Process sampling buffer * @event: The perf event * @flush_all: Flag to also flush partially filled sample-data-blocks @@ -1352,7 +1342,7 @@ static void hw_perf_event_update(struct new.f = 0; new.a = 1; new.overflow = 0; - prev.val = __cdsg(&te->header.val, old.val, new.val); + prev.val = cmpxchg128(&te->header.val, old.val, new.val); } while (prev.val != old.val); /* Advance to next sample-data-block */ @@ -1562,7 +1552,7 @@ static bool aux_set_alert(struct aux_buf } new.a = 1; new.overflow = 0; - prev.val = __cdsg(&te->header.val, old.val, new.val); + prev.val = cmpxchg128(&te->header.val, old.val, new.val); } while (prev.val != old.val); return true; } @@ -1636,7 +1626,7 @@ static bool aux_reset_buffer(struct aux_ new.a = 1; else new.a = 0; - prev.val = __cdsg(&te->header.val, old.val, new.val); + prev.val = cmpxchg128(&te->header.val, old.val, new.val); } while (prev.val != old.val); *overflow += orig_overflow; }