From patchwork Fri Apr 19 13:53:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Andrew Jones X-Patchwork-Id: 13636411 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 36DB4C4345F for ; Fri, 19 Apr 2024 13:53:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-ID:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=sGNnvc2gbFB1+ZfJaHJGXk3n8i29HMaO4e0cmg92SYA=; b=ggXojA7GXfESW0 rQIu95c/lMkTrCKtSVJfXhaabiVstHwblt3ORelIbvp4WYwf7puYjk6o9RcfC+nlwhVaZApaqCiOa vtTc6VQ2uWcaLV72qVsgFElvFMxxbmE0OU/g1vAFcPcWG2cKYMhrymJVqBkPrBmMr/atq5OEg/1tD ey0MVcU26tj0KvaXXrSRYvdqfyvq3JIpDgkeOEcdppMOepgyCBnB6q7fbc0Phv+K2SZ9DmtUmfurC BOQmSbGhxqhCVLyBSEXi4No3mJx5OD7p2+kkyuo8gOU3G3OwWGWrB4eEcTINQfpiNWoEAV1PBi8nJ YIIyzM01odqRKNpSCsyw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rxogM-00000005pJA-126J; Fri, 19 Apr 2024 13:53:38 +0000 Received: from mail-ed1-x52c.google.com ([2a00:1450:4864:20::52c]) by bombadil.infradead.org with esmtps (Exim 4.97.1 #2 (Red Hat Linux)) id 1rxogE-00000005pCR-28LN for linux-riscv@lists.infradead.org; Fri, 19 Apr 2024 13:53:32 +0000 Received: by mail-ed1-x52c.google.com with SMTP id 4fb4d7f45d1cf-571c22d9de4so2213632a12.3 for ; Fri, 19 Apr 2024 06:53:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ventanamicro.com; s=google; t=1713534808; x=1714139608; darn=lists.infradead.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=nEizurJcSR1aY1vBLFtX+JNkUfZx2QKjVF6QEsuG1UY=; b=XR636xQoQ1f/dyBx1t2HsNEXwDFWXqZfCZJLETFuKaaDZwfx5Dtm2ec2YuSCcEjJZv 1z7q8x1ZEs0kc0Agh7cWuEeeM5LQhvFw8YIFTbPsCOtFLEP9Dc+S2AuxG7xQwZNzW6xq guWmIybbyvI/PVj1ZOuSer7P0M+ew9Qd1atJn6FQLBAJ0nji0u2b1jeA/y6d4PYbbwpM NTyCtkNL/DaQLT7LXuZG9OJNG4j/xMMs2RE12arMltHsEQFiQBYV6zW8Qx0WOJfzzMvj 9ybHQtVeI+0ivxW5jOceEu4+XhPaCNeV6DOyKJqj3x+9fb1KggjHQq4YfoeCXTweqUOu MTGg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1713534808; x=1714139608; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=nEizurJcSR1aY1vBLFtX+JNkUfZx2QKjVF6QEsuG1UY=; b=q3OcuKfGx0nd43Fwqm3tBAJIUiOkYBksdYoEHe+rj9k/rznvnJy6NPBSBGuquJIjDe UwLPcYYq1q3QC6+N7iyltsYGTsgfwT4FpH+G7Mq16gk6HmqhCxAC5gmZVD3WMRZWuyg9 Dj4mO5+Sb++FDrbgclOdxEfi8Ly3JYUy2pn0vCtRgTySEpm7sdtOXh9H067QcLwbieB/ wrX2egtF683VKtUSheXpL4NdlF85Or8TNfM2Y79vZhh4nFhRAmK0svPQY5IFza+PP6V8 JqYMQhU7Z1oWTPKEYt8tp/SX+J4DCsTAZpiYrMc6KE+6L+uAgeFODQaCIocssN0b9bB6 N8Xw== X-Gm-Message-State: AOJu0Yxs/uk6l1/XQr5IRxi54dntYu3VxHQSP7LQkgJKrIJoY9vOxfEG oadAvg0fQZi61FsGoqt6wda1JyJ1mBxu6evAUu7u9oPzQIOOQ3f/CQwxTTliMRbOhAQbTAHLfq9 O+qs= X-Google-Smtp-Source: AGHT+IGkQt2hO18a+SNSAYfEOPucXpHAMV1WR0RrkuDvLuCw0MpZGKjuj++zRhwHCrnJC9FEvEAlqg== X-Received: by 2002:a50:baad:0:b0:56e:323b:d7e7 with SMTP id x42-20020a50baad000000b0056e323bd7e7mr1751219ede.34.1713534807639; Fri, 19 Apr 2024 06:53:27 -0700 (PDT) Received: from localhost (2001-1ae9-1c2-4c00-20f-c6b4-1e57-7965.ip6.tmcz.cz. [2001:1ae9:1c2:4c00:20f:c6b4:1e57:7965]) by smtp.gmail.com with ESMTPSA id d6-20020a05640208c600b0056e72c4a330sm2148891edz.41.2024.04.19.06.53.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 19 Apr 2024 06:53:27 -0700 (PDT) From: Andrew Jones To: linux-riscv@lists.infradead.org, kvm-riscv@lists.infradead.org, devicetree@vger.kernel.org Cc: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, conor.dooley@microchip.com, anup@brainfault.org, atishp@atishpatra.org, robh@kernel.org, krzysztof.kozlowski+dt@linaro.org, conor+dt@kernel.org, christoph.muellner@vrull.eu, heiko@sntech.de, charlie@rivosinc.com, David.Laight@ACULAB.COM, parri.andrea@gmail.com, luxu.kernel@bytedance.com Subject: [PATCH v2 3/6] riscv: Add Zawrs support for spinlocks Date: Fri, 19 Apr 2024 15:53:25 +0200 Message-ID: <20240419135321.70781-11-ajones@ventanamicro.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240419135321.70781-8-ajones@ventanamicro.com> References: <20240419135321.70781-8-ajones@ventanamicro.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240419_065330_689051_DF2D410C X-CRM114-Status: GOOD ( 20.06 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org From: Christoph Müllner RISC-V code uses the generic ticket lock implementation, which calls the macros smp_cond_load_relaxed() and smp_cond_load_acquire(). Introduce a RISC-V specific implementation of smp_cond_load_relaxed() which applies WRS.NTO of the Zawrs extension in order to reduce power consumption while waiting and allows hypervisors to enable guests to trap while waiting. smp_cond_load_acquire() doesn't need a RISC-V specific implementation as the generic implementation is based on smp_cond_load_relaxed() and smp_acquire__after_ctrl_dep() sufficiently provides the acquire semantics. This implementation is heavily based on Arm's approach which is the approach Andrea Parri also suggested. The Zawrs specification can be found here: https://github.com/riscv/riscv-zawrs/blob/main/zawrs.adoc Signed-off-by: Christoph Müllner Co-developed-by: Andrew Jones Signed-off-by: Andrew Jones --- arch/riscv/Kconfig | 13 ++++++++ arch/riscv/include/asm/barrier.h | 45 ++++++++++++++++++--------- arch/riscv/include/asm/cmpxchg.h | 51 +++++++++++++++++++++++++++++++ arch/riscv/include/asm/hwcap.h | 1 + arch/riscv/include/asm/insn-def.h | 2 ++ arch/riscv/kernel/cpufeature.c | 1 + 6 files changed, 98 insertions(+), 15 deletions(-) diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index 7427d8088337..34bbe6b70546 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -578,6 +578,19 @@ config RISCV_ISA_V_PREEMPTIVE preemption. Enabling this config will result in higher memory consumption due to the allocation of per-task's kernel Vector context. +config RISCV_ISA_ZAWRS + bool "Zawrs extension support for more efficient busy waiting" + depends on RISCV_ALTERNATIVE + default y + help + The Zawrs extension defines instructions to be used in polling loops + which allow a hart to enter a low-power state or to trap to the + hypervisor while waiting on a store to a memory location. Enable the + use of these instructions in the kernel when the Zawrs extension is + detected at boot. + + If you don't know what to do here, say Y. + config TOOLCHAIN_HAS_ZBB bool default y diff --git a/arch/riscv/include/asm/barrier.h b/arch/riscv/include/asm/barrier.h index 880b56d8480d..e1d9bf1deca6 100644 --- a/arch/riscv/include/asm/barrier.h +++ b/arch/riscv/include/asm/barrier.h @@ -11,6 +11,7 @@ #define _ASM_RISCV_BARRIER_H #ifndef __ASSEMBLY__ +#include #include #define nop() __asm__ __volatile__ ("nop") @@ -28,21 +29,6 @@ #define __smp_rmb() RISCV_FENCE(r, r) #define __smp_wmb() RISCV_FENCE(w, w) -#define __smp_store_release(p, v) \ -do { \ - compiletime_assert_atomic_type(*p); \ - RISCV_FENCE(rw, w); \ - WRITE_ONCE(*p, v); \ -} while (0) - -#define __smp_load_acquire(p) \ -({ \ - typeof(*p) ___p1 = READ_ONCE(*p); \ - compiletime_assert_atomic_type(*p); \ - RISCV_FENCE(r, rw); \ - ___p1; \ -}) - /* * This is a very specific barrier: it's currently only used in two places in * the kernel, both in the scheduler. See include/linux/spinlock.h for the two @@ -70,6 +56,35 @@ do { \ */ #define smp_mb__after_spinlock() RISCV_FENCE(iorw, iorw) +#define __smp_store_release(p, v) \ +do { \ + compiletime_assert_atomic_type(*p); \ + RISCV_FENCE(rw, w); \ + WRITE_ONCE(*p, v); \ +} while (0) + +#define __smp_load_acquire(p) \ +({ \ + typeof(*p) ___p1 = READ_ONCE(*p); \ + compiletime_assert_atomic_type(*p); \ + RISCV_FENCE(r, rw); \ + ___p1; \ +}) + +#ifdef CONFIG_RISCV_ISA_ZAWRS +#define smp_cond_load_relaxed(ptr, cond_expr) ({ \ + typeof(ptr) __PTR = (ptr); \ + __unqual_scalar_typeof(*ptr) VAL; \ + for (;;) { \ + VAL = READ_ONCE(*__PTR); \ + if (cond_expr) \ + break; \ + __cmpwait_relaxed(ptr, VAL); \ + } \ + (typeof(*ptr))VAL; \ +}) +#endif + #include #endif /* __ASSEMBLY__ */ diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h index 2fee65cc8443..0926ac7f4ca6 100644 --- a/arch/riscv/include/asm/cmpxchg.h +++ b/arch/riscv/include/asm/cmpxchg.h @@ -8,7 +8,10 @@ #include +#include #include +#include +#include #define __xchg_relaxed(ptr, new, size) \ ({ \ @@ -359,4 +362,52 @@ arch_cmpxchg_relaxed((ptr), (o), (n)); \ }) +#ifdef CONFIG_RISCV_ISA_ZAWRS +static __always_inline void __cmpwait(volatile void *ptr, + unsigned long val, + int size) +{ + unsigned long tmp; + + asm goto(ALTERNATIVE("j %l[no_zawrs]", "nop", + 0, RISCV_ISA_EXT_ZAWRS, 1) + : : : : no_zawrs); + + switch (size) { + case 4: + asm volatile( + " lr.w %0, %1\n" + " xor %0, %0, %2\n" + " bnez %0, 1f\n" + ZAWRS_WRS_NTO "\n" + "1:" + : "=&r" (tmp), "+A" (*(u32 *)ptr) + : "r" (val)); + break; +#if __riscv_xlen == 64 + case 8: + asm volatile( + " lr.d %0, %1\n" + " xor %0, %0, %2\n" + " bnez %0, 1f\n" + ZAWRS_WRS_NTO "\n" + "1:" + : "=&r" (tmp), "+A" (*(u64 *)ptr) + : "r" (val)); + break; +#endif + default: + BUILD_BUG(); + } + + return; + +no_zawrs: + asm volatile(RISCV_PAUSE : : : "memory"); +} + +#define __cmpwait_relaxed(ptr, val) \ + __cmpwait((ptr), (unsigned long)(val), sizeof(*(ptr))) +#endif + #endif /* _ASM_RISCV_CMPXCHG_H */ diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h index e17d0078a651..5b358c3cf212 100644 --- a/arch/riscv/include/asm/hwcap.h +++ b/arch/riscv/include/asm/hwcap.h @@ -81,6 +81,7 @@ #define RISCV_ISA_EXT_ZTSO 72 #define RISCV_ISA_EXT_ZACAS 73 #define RISCV_ISA_EXT_XANDESPMU 74 +#define RISCV_ISA_EXT_ZAWRS 75 #define RISCV_ISA_EXT_XLINUXENVCFG 127 diff --git a/arch/riscv/include/asm/insn-def.h b/arch/riscv/include/asm/insn-def.h index 64dffaa21bfa..9a913010cdd9 100644 --- a/arch/riscv/include/asm/insn-def.h +++ b/arch/riscv/include/asm/insn-def.h @@ -197,5 +197,7 @@ RS1(base), SIMM12(4)) #define RISCV_PAUSE ".4byte 0x100000f" +#define ZAWRS_WRS_NTO ".4byte 0x00d00073" +#define ZAWRS_WRS_STO ".4byte 0x01d00073" #endif /* __ASM_INSN_DEF_H */ diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c index 3ed2359eae35..02de9eaa3f42 100644 --- a/arch/riscv/kernel/cpufeature.c +++ b/arch/riscv/kernel/cpufeature.c @@ -257,6 +257,7 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = { __RISCV_ISA_EXT_DATA(zihintpause, RISCV_ISA_EXT_ZIHINTPAUSE), __RISCV_ISA_EXT_DATA(zihpm, RISCV_ISA_EXT_ZIHPM), __RISCV_ISA_EXT_DATA(zacas, RISCV_ISA_EXT_ZACAS), + __RISCV_ISA_EXT_DATA(zawrs, RISCV_ISA_EXT_ZAWRS), __RISCV_ISA_EXT_DATA(zfa, RISCV_ISA_EXT_ZFA), __RISCV_ISA_EXT_DATA(zfh, RISCV_ISA_EXT_ZFH), __RISCV_ISA_EXT_DATA(zfhmin, RISCV_ISA_EXT_ZFHMIN),