From patchwork Sat Aug 4 09:55:53 2018
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 10555655
From: Ard Biesheuvel
To: linux-arm-kernel@lists.infradead.org
Cc: mark.rutland@arm.com, trong@android.com, Ard Biesheuvel,
    catalin.marinas@arm.com, ndesaulniers@google.com, will.deacon@arm.com
Subject: [RFC PATCH] arm64: lse: provide additional GPR to 'fetch' LL/SC fallback variants
Date: Sat, 4 Aug 2018 11:55:53 +0200
Message-Id: <20180804095553.16358-1-ard.biesheuvel@linaro.org>

When support for ARMv8.1 LSE atomics is compiled in, the original
LL/SC implementations are demoted to fallbacks that are invoked via
function calls on systems that do not implement the new instructions.

Because these function calls may be made from modules that are located
further than 128 MB away from their targets in the core kernel, they
may be indirected via PLT entries, which are permitted to clobber
registers x16 and x17. Since we must assume that those registers do
not retain their value across a function call to such an LL/SC
fallback, and given that those function calls are hidden from the
compiler entirely, we must assume that calling any of the LSE atomics
routines clobbers x16 and x17 (and x30, for that matter).

Fortunately, there is an upside: having two scratch registers
available permits the compiler to emit many of the LL/SC fallbacks
without having to preserve/restore registers on the stack, which would
penalise the users of the LL/SC fallbacks even more, given that they
already put up with the function call overhead.

However, the 'fetch' variants need an additional scratch register in
order to execute without having to preserve registers on the stack.
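To see where the extra register pressure comes from, consider the
shape of an LL/SC 'fetch' sequence. The sketch below is hand-written
for illustration (it is not the kernel's actual code, and the function
name is made up): the fetched old value must stay live until it is
returned, on top of the computed new value and the store-exclusive
status flag, i.e. three temporaries where a plain op needs only two.

static inline int atomic_fetch_add_sketch(int i, int *counter)
{
	int old, new, status;	/* three temporaries; a plain op needs two */

	asm volatile(
	/* load-exclusive the old value, compute the new one */
	"1:	ldxr	%w[old], %[v]\n"
	"	add	%w[new], %w[old], %w[i]\n"
	/* store-exclusive; retry from 1: if we lost the reservation */
	"	stxr	%w[status], %w[new], %[v]\n"
	"	cbnz	%w[status], 1b"
	: [old] "=&r" (old), [new] "=&r" (new), [status] "=&r" (status),
	  [v] "+Q" (*counter)
	: [i] "r" (i));

	return old;	/* 'fetch' variants return the old value */
}

A plain atomic_add(), by contrast, can overwrite the loaded value with
the result, so it gets by with the result and status registers alone.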
So let's give those routines an additional scratch register, x15, when
they are emitted as LL/SC fallbacks, and ensure that the register is
marked as clobbered at the associated LSE call sites (but not anywhere
else).

Signed-off-by: Ard Biesheuvel
---
 arch/arm64/include/asm/atomic_ll_sc.h | 16 +++++++++-------
 arch/arm64/include/asm/atomic_lse.h   | 12 ++++++------
 arch/arm64/include/asm/lse.h          |  3 +++
 arch/arm64/lib/Makefile               |  2 +-
 4 files changed, 19 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/include/asm/atomic_ll_sc.h b/arch/arm64/include/asm/atomic_ll_sc.h
index f5a2d09afb38..7a2ac2900810 100644
--- a/arch/arm64/include/asm/atomic_ll_sc.h
+++ b/arch/arm64/include/asm/atomic_ll_sc.h
@@ -51,7 +51,8 @@ __LL_SC_PREFIX(atomic_##op(int i, atomic_t *v))			\
 "	stxr	%w1, %w0, %2\n"						\
 "	cbnz	%w1, 1b"						\
 	: "=&r" (result), "=&r" (tmp), "+Q" (v->counter)		\
-	: "Ir" (i));							\
+	: "Ir" (i)							\
+	: __LL_SC_PRESERVE());						\
 }									\
 __LL_SC_EXPORT(atomic_##op);
 
@@ -71,7 +72,7 @@ __LL_SC_PREFIX(atomic_##op##_return##name(int i, atomic_t *v))	\
 "	" #mb								\
 	: "=&r" (result), "=&r" (tmp), "+Q" (v->counter)		\
 	: "Ir" (i)							\
-	: cl);								\
+	: __LL_SC_PRESERVE(cl));					\
 									\
 	return result;							\
 }									\
@@ -145,7 +146,8 @@ __LL_SC_PREFIX(atomic64_##op(long i, atomic64_t *v))		\
 "	stxr	%w1, %0, %2\n"						\
 "	cbnz	%w1, 1b"						\
 	: "=&r" (result), "=&r" (tmp), "+Q" (v->counter)		\
-	: "Ir" (i));							\
+	: "Ir" (i)							\
+	: __LL_SC_PRESERVE());						\
 }									\
 __LL_SC_EXPORT(atomic64_##op);
 
@@ -165,7 +167,7 @@ __LL_SC_PREFIX(atomic64_##op##_return##name(long i, atomic64_t *v))\
 "	" #mb								\
 	: "=&r" (result), "=&r" (tmp), "+Q" (v->counter)		\
 	: "Ir" (i)							\
-	: cl);								\
+	: __LL_SC_PRESERVE(cl));					\
 									\
 	return result;							\
 }									\
@@ -242,7 +244,7 @@ __LL_SC_PREFIX(atomic64_dec_if_positive(atomic64_t *v))
 	"2:"
 	: "=&r" (result), "=&r" (tmp), "+Q" (v->counter)
 	:
-	: "cc", "memory");
+	: __LL_SC_PRESERVE("cc", "memory"));
 
 	return result;
 }
@@ -268,7 +270,7 @@ __LL_SC_PREFIX(__cmpxchg_case_##name(volatile void *ptr,		\
 	: [tmp] "=&r" (tmp), [oldval] "=&r" (oldval),			\
 	  [v] "+Q" (*(unsigned long *)ptr)				\
 	: [old] "Lr" (old), [new] "r" (new)				\
-	: cl);								\
+	: __LL_SC_PRESERVE(cl));					\
 									\
 	return oldval;							\
 }									\
@@ -316,7 +318,7 @@ __LL_SC_PREFIX(__cmpxchg_double##name(unsigned long old1,		\
 	"2:"								\
 	: "=&r" (tmp), "=&r" (ret), "+Q" (*(unsigned long *)ptr)	\
 	: "r" (old1), "r" (old2), "r" (new1), "r" (new2)		\
-	: cl);								\
+	: __LL_SC_PRESERVE(cl));					\
 									\
 	return ret;							\
 }									\
diff --git a/arch/arm64/include/asm/atomic_lse.h b/arch/arm64/include/asm/atomic_lse.h
index f9b0b09153e0..2520f8c2ee4a 100644
--- a/arch/arm64/include/asm/atomic_lse.h
+++ b/arch/arm64/include/asm/atomic_lse.h
@@ -59,7 +59,7 @@ static inline int atomic_fetch_##op##name(int i, atomic_t *v)	\
 "	" #asm_op #mb "	%w[i], %w[i], %[v]")				\
 	: [i] "+r" (w0), [v] "+Q" (v->counter)				\
 	: "r" (x1)							\
-	: __LL_SC_CLOBBERS, ##cl);					\
+	: __LL_SC_FETCH_CLOBBERS, ##cl);				\
 									\
 	return w0;							\
 }
@@ -137,7 +137,7 @@ static inline int atomic_fetch_and##name(int i, atomic_t *v)	\
 "	ldclr" #mb "	%w[i], %w[i], %[v]")				\
 	: [i] "+&r" (w0), [v] "+Q" (v->counter)			\
 	: "r" (x1)							\
-	: __LL_SC_CLOBBERS, ##cl);					\
+	: __LL_SC_FETCH_CLOBBERS, ##cl);				\
 									\
 	return w0;							\
 }
@@ -209,7 +209,7 @@ static inline int atomic_fetch_sub##name(int i, atomic_t *v)	\
 "	ldadd" #mb "	%w[i], %w[i], %[v]")				\
 	: [i] "+&r" (w0), [v] "+Q" (v->counter)			\
 	: "r" (x1)							\
-	: __LL_SC_CLOBBERS, ##cl);					\
+	: __LL_SC_FETCH_CLOBBERS, ##cl);				\
 									\
 	return w0;							\
 }
@@ -256,7 +256,7 @@ static inline long atomic64_fetch_##op##name(long i, atomic64_t *v)	\
 "	" #asm_op #mb "	%[i], %[i], %[v]")				\
 	: [i] "+r" (x0), [v] "+Q" (v->counter)				\
"r" (x1) \ - : __LL_SC_CLOBBERS, ##cl); \ + : __LL_SC_FETCH_CLOBBERS, ##cl); \ \ return x0; \ } @@ -334,7 +334,7 @@ static inline long atomic64_fetch_and##name(long i, atomic64_t *v) \ " ldclr" #mb " %[i], %[i], %[v]") \ : [i] "+&r" (x0), [v] "+Q" (v->counter) \ : "r" (x1) \ - : __LL_SC_CLOBBERS, ##cl); \ + : __LL_SC_FETCH_CLOBBERS, ##cl); \ \ return x0; \ } @@ -406,7 +406,7 @@ static inline long atomic64_fetch_sub##name(long i, atomic64_t *v) \ " ldadd" #mb " %[i], %[i], %[v]") \ : [i] "+&r" (x0), [v] "+Q" (v->counter) \ : "r" (x1) \ - : __LL_SC_CLOBBERS, ##cl); \ + : __LL_SC_FETCH_CLOBBERS, ##cl); \ \ return x0; \ } diff --git a/arch/arm64/include/asm/lse.h b/arch/arm64/include/asm/lse.h index 8262325e2fc6..7101a7e6df1c 100644 --- a/arch/arm64/include/asm/lse.h +++ b/arch/arm64/include/asm/lse.h @@ -30,6 +30,8 @@ __asm__(".arch_extension lse"); /* Macro for constructing calls to out-of-line ll/sc atomics */ #define __LL_SC_CALL(op) "bl\t" __stringify(__LL_SC_PREFIX(op)) "\n" #define __LL_SC_CLOBBERS "x16", "x17", "x30" +#define __LL_SC_FETCH_CLOBBERS "x15", __LL_SC_CLOBBERS +#define __LL_SC_PRESERVE(...) "x15", ##__VA_ARGS__ /* In-line patching at runtime */ #define ARM64_LSE_ATOMIC_INSN(llsc, lse) \ @@ -49,6 +51,7 @@ __asm__(".arch_extension lse"); #define __LL_SC_INLINE static inline #define __LL_SC_PREFIX(x) x #define __LL_SC_EXPORT(x) +#define __LL_SC_PRESERVE(x...) x #define ARM64_LSE_ATOMIC_INSN(llsc, lse) llsc diff --git a/arch/arm64/lib/Makefile b/arch/arm64/lib/Makefile index 137710f4dac3..be69c4077e75 100644 --- a/arch/arm64/lib/Makefile +++ b/arch/arm64/lib/Makefile @@ -16,7 +16,7 @@ CFLAGS_atomic_ll_sc.o := -fcall-used-x0 -ffixed-x1 -ffixed-x2 \ -ffixed-x3 -ffixed-x4 -ffixed-x5 -ffixed-x6 \ -ffixed-x7 -fcall-saved-x8 -fcall-saved-x9 \ -fcall-saved-x10 -fcall-saved-x11 -fcall-saved-x12 \ - -fcall-saved-x13 -fcall-saved-x14 -fcall-saved-x15 \ + -fcall-saved-x13 -fcall-saved-x14 \ -fcall-saved-x18 -fomit-frame-pointer CFLAGS_REMOVE_atomic_ll_sc.o := -pg GCOV_PROFILE_atomic_ll_sc.o := n