From patchwork Fri Jan 20 14:09:42 2023
X-Patchwork-Submitter: Yann Sionneau
X-Patchwork-Id: 13109927
From: Yann Sionneau <ysionneau@kalray.eu>
To: Arnd Bergmann, Jonathan Corbet, Thomas Gleixner, Marc Zyngier,
 Rob Herring, Krzysztof Kozlowski, Will Deacon, Peter Zijlstra,
 Boqun Feng, Mark Rutland, Eric Biederman, Kees Cook, Oleg Nesterov,
 Ingo Molnar, Waiman Long, "Aneesh Kumar K.V", Andrew Morton,
 Nick Piggin, Paul Moore, Eric Paris, Christian Brauner,
 Paul Walmsley, Palmer Dabbelt, Albert Ou, Jules Maselbas,
 Yann Sionneau, Guillaume Thouvenin, Clement Leger, Vincent Chardon,
 Marc Poulhiès, Julian Vetter, Samuel Jones, Ashley Lesdalons,
 Thomas Costis, Marius Gligor, Jonathan Borne, Julien Villette,
 Luc Michel, Louis Morhet, Julien Hascoet, Jean-Christophe Pince,
 Guillaume Missonnier, Alex Michon, Huacai Chen, WANG Xuerui,
 Shaokun Zhang, John Garry, Guangbin Huang, Bharat Bhushan, Bibo Mao,
 Atish Patra, "Jason A. Donenfeld", Qi Liu, Jiaxun Yang,
 Catalin Marinas, Mark Brown, Janosch Frank, Alexey Dobriyan
Cc: Benjamin Mugnier, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, devicetree@vger.kernel.org,
 linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-audit@redhat.com,
 linux-riscv@lists.infradead.org, bpf@vger.kernel.org
Subject: [RFC PATCH v2 11/31] kvx: Add atomic/locking headers
Date: Fri, 20 Jan 2023 15:09:42 +0100
Message-ID: <20230120141002.2442-12-ysionneau@kalray.eu>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <20230120141002.2442-1-ysionneau@kalray.eu>
References: <20230120141002.2442-1-ysionneau@kalray.eu>

Add common headers (atomic, bitops, barrier and locking) for basic
kvx support.

Co-developed-by: Clement Leger
Signed-off-by: Clement Leger
Co-developed-by: Jules Maselbas
Signed-off-by: Jules Maselbas
Co-developed-by: Julian Vetter
Signed-off-by: Julian Vetter
Co-developed-by: Julien Villette
Signed-off-by: Julien Villette
Co-developed-by: Yann Sionneau
Signed-off-by: Yann Sionneau
---

Notes:
    V1 -> V2:
     - use {READ,WRITE}_ONCE for arch_atomic64_{read,set}
     - use asm-generic/bitops/atomic.h instead of __test_and_*_bit
     - removed duplicated includes
     - rewrite xchg and cmpxchg in C using builtins for acswap insn

 arch/kvx/include/asm/atomic.h  | 104 ++++++++++++++++++++
 arch/kvx/include/asm/barrier.h |  15 +++
 arch/kvx/include/asm/bitops.h  | 115 ++++++++++++++++++++++
 arch/kvx/include/asm/bitrev.h  |  32 +++++++
 arch/kvx/include/asm/cmpxchg.h | 170 +++++++++++++++++++++++++++++++++
 5 files changed, 436 insertions(+)
 create mode 100644 arch/kvx/include/asm/atomic.h
 create mode 100644 arch/kvx/include/asm/barrier.h
 create mode 100644 arch/kvx/include/asm/bitops.h
 create mode 100644 arch/kvx/include/asm/bitrev.h
 create mode 100644 arch/kvx/include/asm/cmpxchg.h

diff --git a/arch/kvx/include/asm/atomic.h b/arch/kvx/include/asm/atomic.h
new file mode 100644
index 000000000000..bea3d70785b1
--- /dev/null
+++ b/arch/kvx/include/asm/atomic.h
@@ -0,0 +1,104 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2017-2023 Kalray Inc.
+ * Author(s): Clement Leger
+ */
+
+#ifndef _ASM_KVX_ATOMIC_H
+#define _ASM_KVX_ATOMIC_H
+
+#include
+
+#include
+
+#define ATOMIC64_INIT(i)	{ (i) }
+
+#define arch_atomic64_cmpxchg(v, old, new) (arch_cmpxchg(&((v)->counter), old, new))
+#define arch_atomic64_xchg(v, new) (arch_xchg(&((v)->counter), new))
+
+static inline long arch_atomic64_read(const atomic64_t *v)
+{
+	return READ_ONCE(v->counter);
+}
+
+static inline void arch_atomic64_set(atomic64_t *v, long i)
+{
+	WRITE_ONCE(v->counter, i);
+}
+
+#define ATOMIC64_RETURN_OP(op, c_op)					\
+static inline long arch_atomic64_##op##_return(long i, atomic64_t *v)	\
+{									\
+	long new, old, ret;						\
+									\
+	do {								\
+		old = v->counter;					\
+		new = old c_op i;					\
+		ret = arch_cmpxchg(&v->counter, old, new);		\
+	} while (ret != old);						\
+									\
+	return new;							\
+}
+
+#define ATOMIC64_OP(op, c_op)						\
+static inline void arch_atomic64_##op(long i, atomic64_t *v)		\
+{									\
+	long new, old, ret;						\
+									\
+	do {								\
+		old = v->counter;					\
+		new = old c_op i;					\
+		ret = arch_cmpxchg(&v->counter, old, new);		\
+	} while (ret != old);						\
+}
+
+#define ATOMIC64_FETCH_OP(op, c_op)					\
+static inline long arch_atomic64_fetch_##op(long i, atomic64_t *v)	\
+{									\
+	long new, old, ret;						\
+									\
+	do {								\
+		old = v->counter;					\
+		new = old c_op i;					\
+		ret = arch_cmpxchg(&v->counter, old, new);		\
+	} while (ret != old);						\
+									\
+	return old;							\
+}
+
+#define ATOMIC64_OPS(op, c_op)						\
+	ATOMIC64_OP(op, c_op)						\
+	ATOMIC64_RETURN_OP(op, c_op)					\
+	ATOMIC64_FETCH_OP(op, c_op)
+
+ATOMIC64_OPS(and, &)
+ATOMIC64_OPS(or, |)
+ATOMIC64_OPS(xor, ^)
+ATOMIC64_OPS(add, +)
+ATOMIC64_OPS(sub, -)
+
+#undef ATOMIC64_OPS
+#undef ATOMIC64_FETCH_OP
+#undef ATOMIC64_OP
+
+static inline int arch_atomic_add_return(int i, atomic_t *v)
+{
+	int new, old, ret;
+
+	do {
+		old = v->counter;
+		new = old + i;
+		ret = arch_cmpxchg(&v->counter, old, new);
+	} while (ret != old);
+
+	return new;
+}
+
+static inline int arch_atomic_sub_return(int i, atomic_t *v)
+{
+	return arch_atomic_add_return(-i, v);
+}
+
+#include
+
+#endif	/* _ASM_KVX_ATOMIC_H */
diff --git a/arch/kvx/include/asm/barrier.h b/arch/kvx/include/asm/barrier.h
new file mode 100644
index 000000000000..371f1c70746d
--- /dev/null
+++ b/arch/kvx/include/asm/barrier.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2017-2023 Kalray Inc.
+ * Author(s): Clement Leger
+ */
+
+#ifndef _ASM_KVX_BARRIER_H
+#define _ASM_KVX_BARRIER_H
+
+/* fence is sufficient to guarantee write ordering */
+#define mb()	__builtin_kvx_fence()
+
+#include
+
+#endif /* _ASM_KVX_BARRIER_H */
diff --git a/arch/kvx/include/asm/bitops.h b/arch/kvx/include/asm/bitops.h
new file mode 100644
index 000000000000..c643f4765059
--- /dev/null
+++ b/arch/kvx/include/asm/bitops.h
@@ -0,0 +1,115 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2017-2023 Kalray Inc.
+ * Author(s): Clement Leger
+ *            Yann Sionneau
+ */
+
+#ifndef _ASM_KVX_BITOPS_H
+#define _ASM_KVX_BITOPS_H
+
+#ifdef __KERNEL__
+
+#ifndef _LINUX_BITOPS_H
+#error only <linux/bitops.h> can be included directly
+#endif
+
+#include
+
+static inline int fls(int x)
+{
+	return 32 - __builtin_kvx_clzw(x);
+}
+
+static inline int fls64(__u64 x)
+{
+	return 64 - __builtin_kvx_clzd(x);
+}
+
+/**
+ * __ffs - find first set bit in word
+ * @word: The word to search
+ *
+ * Undefined if no set bit exists, so code should check against 0 first.
+ */
+static inline unsigned long __ffs(unsigned long word)
+{
+	return __builtin_kvx_ctzd(word);
+}
+
+/**
+ * __fls - find last set bit in word
+ * @word: The word to search
+ *
+ * Undefined if no set bit exists, so code should check against 0 first.
+ */
+static inline unsigned long __fls(unsigned long word)
+{
+	return 63 - __builtin_kvx_clzd(word);
+}
+
+
+/**
+ * ffs - find first set bit in word
+ * @x: the word to search
+ *
+ * This is defined the same way as the libc and compiler builtin ffs
+ * routines, therefore differs in spirit from the other bitops.
+ *
+ * ffs(value) returns 0 if value is 0 or the position of the first
+ * set bit if value is nonzero. The first (least significant) bit
+ * is at position 1.
+ */
+static inline int ffs(int x)
+{
+	if (!x)
+		return 0;
+	return __builtin_kvx_ctzw(x) + 1;
+}
+
+static inline unsigned int __arch_hweight32(unsigned int w)
+{
+	unsigned int count;
+
+	asm volatile ("cbsw %0 = %1\n\t;;"
+		      : "=r" (count)
+		      : "r" (w));
+
+	return count;
+}
+
+static inline unsigned int __arch_hweight64(__u64 w)
+{
+	unsigned int count;
+
+	asm volatile ("cbsd %0 = %1\n\t;;"
+		      : "=r" (count)
+		      : "r" (w));
+
+	return count;
+}
+
+static inline unsigned int __arch_hweight16(unsigned int w)
+{
+	return __arch_hweight32(w & 0xffff);
+}
+
+static inline unsigned int __arch_hweight8(unsigned int w)
+{
+	return __arch_hweight32(w & 0xff);
+}
+
+#include
+
+#include
+#include
+
+#include
+#include
+#include
+#include
+#include
+
+#endif
+
+#endif
diff --git a/arch/kvx/include/asm/bitrev.h b/arch/kvx/include/asm/bitrev.h
new file mode 100644
index 000000000000..79865081905a
--- /dev/null
+++ b/arch/kvx/include/asm/bitrev.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2017-2023 Kalray Inc.
+ * Author(s): Clement Leger
+ */
+
+#ifndef _ASM_KVX_BITREV_H
+#define _ASM_KVX_BITREV_H
+
+#include
+
+/* Bit reversal constant for matrix multiply */
+#define BIT_REVERSE 0x0102040810204080ULL
+
+static __always_inline __attribute_const__ u32 __arch_bitrev32(u32 x)
+{
+	/* Reverse all bits for each byte and then byte-reverse the 32 LSB */
+	return swab32(__builtin_kvx_sbmm8(BIT_REVERSE, x));
+}
+
+static __always_inline __attribute_const__ u16 __arch_bitrev16(u16 x)
+{
+	/* Reverse all bits for each byte and then byte-reverse the 16 LSB */
+	return swab16(__builtin_kvx_sbmm8(BIT_REVERSE, x));
+}
+
+static __always_inline __attribute_const__ u8 __arch_bitrev8(u8 x)
+{
+	return __builtin_kvx_sbmm8(BIT_REVERSE, x);
+}
+
+#endif
diff --git a/arch/kvx/include/asm/cmpxchg.h b/arch/kvx/include/asm/cmpxchg.h
new file mode 100644
index 000000000000..51ccb83757cc
--- /dev/null
+++ b/arch/kvx/include/asm/cmpxchg.h
@@ -0,0 +1,170 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2017-2023 Kalray Inc.
+ * Author(s): Clement Leger
+ *            Yann Sionneau
+ *            Jules Maselbas
+ */
+
+#ifndef _ASM_KVX_CMPXCHG_H
+#define _ASM_KVX_CMPXCHG_H
+
+#include
+#include
+#include
+#include
+
+/*
+ * On kvx, we have a boolean compare and swap, meaning the operation only
+ * returns whether it succeeded.
+ * If it succeeded, things are simple: we just return the "old" value the
+ * caller provided. If it failed, however, we need to load the current value
+ * to return it to the caller. If the loaded value differs from the "old"
+ * value provided by the caller, we can return it, since the difference tells
+ * the caller the operation failed.
+ * But if for some reason the value we read equals the "old" value provided
+ * by the caller, we cannot simply return it, or the caller will think the
+ * operation succeeded. So in that case we retry until we either succeed or
+ * fail with a value different from the provided one.
+ */
+
+static inline unsigned int __cmpxchg_u32(unsigned int old, unsigned int new,
+					 volatile unsigned int *ptr)
+{
+	unsigned int exp = old;
+
+	__builtin_kvx_fence();
+	while (exp == old) {
+		if (__builtin_kvx_acswapw((void *)ptr, new, exp))
+			break; /* acswap succeeded */
+		exp = *ptr;
+	}
+
+	return exp;
+}
+
+static inline unsigned long __cmpxchg_u64(unsigned long old, unsigned long new,
+					  volatile unsigned long *ptr)
+{
+	unsigned long exp = old;
+
+	__builtin_kvx_fence();
+	while (exp == old) {
+		if (__builtin_kvx_acswapd((void *)ptr, new, exp))
+			break; /* acswap succeeded */
+		exp = *ptr;
+	}
+
+	return exp;
+}
+
+extern unsigned long __cmpxchg_called_with_bad_pointer(void)
+	__compiletime_error("Bad argument size for cmpxchg");
+
+static __always_inline unsigned long __cmpxchg(unsigned long old,
+					       unsigned long new,
+					       volatile void *ptr, int size)
+{
+	switch (size) {
+	case 4:
+		return __cmpxchg_u32(old, new, ptr);
+	case 8:
+		return __cmpxchg_u64(old, new, ptr);
+	default:
+		return __cmpxchg_called_with_bad_pointer();
+	}
+}
+
+#define arch_cmpxchg(ptr, old, new)					\
+	((__typeof__(*(ptr))) __cmpxchg(				\
+		(unsigned long)(old), (unsigned long)(new),		\
+		(ptr), sizeof(*(ptr))))
+
+/*
+ * In order to optimize xchg for 16-bit values, we can use insf/extfz if we
+ * know the bounds. This way, we only take one more bundle than the standard
+ * xchg. We simply do a read-modify-acswap on a 32-bit word.
+ */
+
+#define __kvx_insf(org, val, start, stop) __asm__ __volatile__(	\
+		"insf %[_org] = %[_val], %[_stop], %[_start]\n\t;;"	\
+		: [_org]"+r"(org)					\
+		: [_val]"r"(val), [_stop]"i"(stop), [_start]"i"(start))
+
+#define __kvx_extfz(out, val, start, stop) __asm__ __volatile__(	\
+		"extfz %[_out] = %[_val], %[_stop], %[_start]\n\t;;"	\
+		: [_out]"=r"(out)					\
+		: [_val]"r"(val), [_stop]"i"(stop), [_start]"i"(start))
+
+/* Needed for generic qspinlock implementation */
+static inline unsigned int __xchg_u16(unsigned int old, unsigned int new,
+				      volatile unsigned int *ptr)
+{
+	unsigned int off = ((unsigned long)ptr) % sizeof(unsigned int);
+	unsigned int val;
+
+	ptr = PTR_ALIGN_DOWN(ptr, sizeof(unsigned int));
+	__builtin_kvx_fence();
+	do {
+		old = *ptr;
+		val = old;
+		if (off == 0)
+			__kvx_insf(val, new, 0, 15);
+		else
+			__kvx_insf(val, new, 16, 31);
+	} while (!__builtin_kvx_acswapw((void *)ptr, val, old));
+
+	if (off == 0)
+		__kvx_extfz(old, old, 0, 15);
+	else
+		__kvx_extfz(old, old, 16, 31);
+
+	return old;
+}
+
+static inline unsigned int __xchg_u32(unsigned int old, unsigned int new,
+				      volatile unsigned int *ptr)
+{
+	__builtin_kvx_fence();
+	do
+		old = *ptr;
+	while (!__builtin_kvx_acswapw((void *)ptr, new, old));
+
+	return old;
+}
+
+static inline unsigned long __xchg_u64(unsigned long old, unsigned long new,
+				       volatile unsigned long *ptr)
+{
+	__builtin_kvx_fence();
+	do
+		old = *ptr;
+	while (!__builtin_kvx_acswapd((void *)ptr, new, old));
+
+	return old;
+}
+
+extern unsigned long __xchg_called_with_bad_pointer(void)
+	__compiletime_error("Bad argument size for xchg");
+
+static __always_inline unsigned long __xchg(unsigned long val,
+					    volatile void *ptr, int size)
+{
+	switch (size) {
+	case 2:
+		return __xchg_u16(0, val, ptr);
+	case 4:
+		return __xchg_u32(0, val, ptr);
+	case 8:
+		return __xchg_u64(0, val, ptr);
+	default:
+		return __xchg_called_with_bad_pointer();
+	}
+}
+
+#define arch_xchg(ptr, val)						\
+	((__typeof__(*(ptr))) __xchg(					\
+		(unsigned long)(val),					\
+		(ptr), sizeof(*(ptr))))
+
+#endif
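
---

For readers without a kvx toolchain, the sketch below illustrates the pattern
these headers rely on: a boolean compare-and-swap is wrapped into a
value-returning cmpxchg (retrying when the reread value still equals the
caller's "old", as explained in the cmpxchg.h comment), and atomic
read-modify-write operations are then built as cmpxchg retry loops. It is an
illustration only, not part of the patch: GCC's generic
__atomic_compare_exchange_n stands in for __builtin_kvx_acswapw so it can run
on any host.

/* Illustration only -- compile with: gcc -O2 demo.c && ./a.out */
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for __builtin_kvx_acswapw: returns true iff *ptr was 'exp'
 * and has been replaced by 'new' (a boolean compare-and-swap). */
static bool acswap32(volatile unsigned int *ptr, unsigned int new,
		     unsigned int exp)
{
	return __atomic_compare_exchange_n(ptr, &exp, new, false,
					   __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

/* Value-returning cmpxchg built on the boolean CAS, mirroring
 * __cmpxchg_u32(): on failure, reread and retry while the reread value
 * still equals 'old', so that a return value equal to 'old' always
 * means success. */
static unsigned int cmpxchg32(volatile unsigned int *ptr, unsigned int old,
			      unsigned int new)
{
	unsigned int exp = old;

	while (exp == old) {
		if (acswap32(ptr, new, exp))
			break;		/* CAS succeeded */
		exp = *ptr;		/* reload current value and retry */
	}
	return exp;
}

/* Atomic add_return built as a cmpxchg retry loop, mirroring
 * arch_atomic_add_return(). */
static unsigned int atomic_add_return32(volatile unsigned int *ptr,
					unsigned int i)
{
	unsigned int old, new, ret;

	do {
		old = *ptr;
		new = old + i;
		ret = cmpxchg32(ptr, old, new);
	} while (ret != old);

	return new;
}

int main(void)
{
	volatile unsigned int counter = 40;
	unsigned int prev;

	printf("add_return(2) -> %u\n", atomic_add_return32(&counter, 2));
	prev = cmpxchg32(&counter, 42, 7);
	printf("cmpxchg(42 -> 7) returned %u, counter is now %u\n",
	       prev, counter);
	return 0;
}

The same retry-loop shape appears in the ATOMIC64_OP/ATOMIC64_FETCH_OP macros
and, with insf/extfz merging a halfword into an aligned 32-bit word, in
__xchg_u16().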