From patchwork Sat Sep 28 13:51:27 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mathieu Desnoyers
X-Patchwork-Id: 13814688
From: Mathieu Desnoyers
To: Linus Torvalds
Cc: linux-kernel@vger.kernel.org, Mathieu Desnoyers, Greg Kroah-Hartman,
 Sebastian Andrzej Siewior, "Paul E. McKenney", Will Deacon,
 Peter Zijlstra, Boqun Feng, Alan Stern, John Stultz, Neeraj Upadhyay,
 Frederic Weisbecker, Joel Fernandes, Josh Triplett, Uladzislau Rezki,
 Steven Rostedt, Lai Jiangshan, Zqiang, Ingo Molnar, Waiman Long,
 Mark Rutland, Thomas Gleixner, Vlastimil Babka, maged.michael@gmail.com,
 Mateusz Guzik, Gary Guo, Jonas Oberhauser, rcu@vger.kernel.org,
 linux-mm@kvack.org, lkmm@lists.linux.dev
Subject: [PATCH 1/2] compiler.h: Introduce ptr_eq() to preserve address
 dependency
Date: Sat, 28 Sep 2024 09:51:27 -0400
Message-Id: <20240928135128.991110-2-mathieu.desnoyers@efficios.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20240928135128.991110-1-mathieu.desnoyers@efficios.com>
References: <20240928135128.991110-1-mathieu.desnoyers@efficios.com>

Compiler CSE and SSA GVN optimizations can cause the address dependency
of addresses returned by rcu_dereference() to be lost when comparing
those pointers with either constants or previously loaded pointers.

Introduce ptr_eq() to compare two addresses while preserving the address
dependencies for later use of the address. It should be used when
comparing an address returned by rcu_dereference().

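As an illustration of the constant-comparison case, here is a minimal
userspace sketch that compares a loaded pointer against a statically
known address. OPTIMIZER_HIDE_VAR() and READ_ONCE() below are simplified
stand-ins for the kernel macros (GCC/Clang only), READ_ONCE() takes the
place of rcu_dereference(), and struct cfg, default_cfg, cur_cfg and
read_cfg_value() are hypothetical names that are not part of this patch.

```c
/* Assumption: simplified userspace stand-ins for the kernel macros. */
#define OPTIMIZER_HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Same body as the helper introduced by this patch. */
static inline __attribute__((always_inline))
int ptr_eq(const volatile void *a, const volatile void *b)
{
	OPTIMIZER_HIDE_VAR(a);
	OPTIMIZER_HIDE_VAR(b);
	return a == b;
}

/* Hypothetical data: a static default object and a published pointer. */
struct cfg {
	int value;
};

static struct cfg default_cfg = { .value = 1 };
static struct cfg *cur_cfg = &default_cfg;

/*
 * Hypothetical reader: return the value of the published config if it
 * is the default one, -1 otherwise.
 */
static int read_cfg_value(void)
{
	/* Stands in for rcu_dereference(cur_cfg). */
	struct cfg *c = READ_ONCE(cur_cfg);

	/* A plain (c == &default_cfg) would let the compiler use the constant. */
	if (ptr_eq(c, &default_cfg))
		return c->value;	/* still dereferences c */
	return -1;
}
```

Because the comparison operates on hidden copies of its arguments, the
compiler has no basis for substituting &default_cfg for c in the taken
branch, so the load of c->value is expected to stay dependent on the
load of cur_cfg.
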
This is needed to prevent the compiler CSE and SSA GVN optimizations
from replacing the registers holding @a or @b based on their
equality, which does not preserve address dependencies and allows the
following misordering speculations:

- If @b is a constant, the compiler can issue the loads which depend
  on @a before loading @a.
- If @b is a register populated by a prior load, weakly-ordered
  CPUs can speculate loads which depend on @a before loading @a.

The same logic applies with @a and @b swapped.

The compiler barrier() is ineffective at fixing this issue. It does
not prevent the compiler CSE from losing the address dependency:

int fct_2_volatile_barriers(void)
{
        int *a, *b;

        do {
                a = READ_ONCE(p);
                asm volatile ("" : : : "memory");
                b = READ_ONCE(p);
        } while (a != b);
        asm volatile ("" : : : "memory");  <----- barrier()
        return *b;
}

With gcc 14.2 (arm64):

fct_2_volatile_barriers:
        adrp    x0, .LANCHOR0
        add     x0, x0, :lo12:.LANCHOR0
.L2:
        ldr     x1, [x0]  <------ x1 populated by first load.
        ldr     x2, [x0]
        cmp     x1, x2
        bne     .L2
        ldr     w0, [x1]  <------ x1 is used for access which should depend on b.
        ret

On weakly-ordered architectures, this lets CPU speculation use the
result from the first load to speculate "ldr w0, [x1]" before
"ldr x2, [x0]".
Based on the RCU documentation, the control dependency does not
prevent the CPU from speculating loads.

Suggested-by: Linus Torvalds
Suggested-by: Boqun Feng
Signed-off-by: Mathieu Desnoyers
Reviewed-by: Boqun Feng
Acked-by: "Paul E. McKenney"
Cc: Greg Kroah-Hartman
Cc: Sebastian Andrzej Siewior
Cc: "Paul E. McKenney"
Cc: Will Deacon
Cc: Peter Zijlstra
Cc: Boqun Feng
Cc: Alan Stern
Cc: John Stultz
Cc: Neeraj Upadhyay
Cc: Linus Torvalds
Cc: Boqun Feng
Cc: Frederic Weisbecker
Cc: Joel Fernandes
Cc: Josh Triplett
Cc: Uladzislau Rezki
Cc: Steven Rostedt
Cc: Lai Jiangshan
Cc: Zqiang
Cc: Ingo Molnar
Cc: Waiman Long
Cc: Mark Rutland
Cc: Thomas Gleixner
Cc: Vlastimil Babka
Cc: maged.michael@gmail.com
Cc: Mateusz Guzik
Cc: Gary Guo
Cc: Jonas Oberhauser
Cc: rcu@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: lkmm@lists.linux.dev
---
 include/linux/compiler.h | 62 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 2df665fa2964..f26705c267e8 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -186,6 +186,68 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 	__asm__ ("" : "=r" (var) : "0" (var))
 #endif
 
+/*
+ * Compare two addresses while preserving the address dependencies for
+ * later use of the address. It should be used when comparing an address
+ * returned by rcu_dereference().
+ *
+ * This is needed to prevent the compiler CSE and SSA GVN optimizations
+ * from replacing the registers holding @a or @b based on their
+ * equality, which does not preserve address dependencies and allows the
+ * following misordering speculations:
+ *
+ * - If @b is a constant, the compiler can issue the loads which depend
+ *   on @a before loading @a.
+ * - If @b is a register populated by a prior load, weakly-ordered
+ *   CPUs can speculate loads which depend on @a before loading @a.
+ *
+ * The same logic applies with @a and @b swapped.
+ *
+ * Return value: true if pointers are equal, false otherwise.
+ *
+ * The compiler barrier() is ineffective at fixing this issue. It does
+ * not prevent the compiler CSE from losing the address dependency:
+ *
+ * int fct_2_volatile_barriers(void)
+ * {
+ *     int *a, *b;
+ *
+ *     do {
+ *         a = READ_ONCE(p);
+ *         asm volatile ("" : : : "memory");
+ *         b = READ_ONCE(p);
+ *     } while (a != b);
+ *     asm volatile ("" : : : "memory");  <-- barrier()
+ *     return *b;
+ * }
+ *
+ * With gcc 14.2 (arm64):
+ *
+ * fct_2_volatile_barriers:
+ *         adrp    x0, .LANCHOR0
+ *         add     x0, x0, :lo12:.LANCHOR0
+ * .L2:
+ *         ldr     x1, [x0]  <-- x1 populated by first load.
+ *         ldr     x2, [x0]
+ *         cmp     x1, x2
+ *         bne     .L2
+ *         ldr     w0, [x1]  <-- x1 is used for access which should depend on b.
+ *         ret
+ *
+ * On weakly-ordered architectures, this lets CPU speculation use the
+ * result from the first load to speculate "ldr w0, [x1]" before
+ * "ldr x2, [x0]".
+ * Based on the RCU documentation, the control dependency does not
+ * prevent the CPU from speculating loads.
+ */
+static __always_inline
+int ptr_eq(const volatile void *a, const volatile void *b)
+{
+	OPTIMIZER_HIDE_VAR(a);
+	OPTIMIZER_HIDE_VAR(b);
+	return a == b;
+}
+
 #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__)
 
 /**
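
For comparison, the retry loop from the changelog could be written with
ptr_eq() in place of the plain "a != b" test. The following is a
userspace sketch only: OPTIMIZER_HIDE_VAR() and READ_ONCE() are
simplified stand-ins for the kernel macros (GCC/Clang only), and the
names value, p and fct_ptr_eq() are assumptions made for illustration.

```c
/* Assumption: simplified userspace stand-ins for the kernel macros. */
#define OPTIMIZER_HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Same body as the helper added by the diff above. */
static inline __attribute__((always_inline))
int ptr_eq(const volatile void *a, const volatile void *b)
{
	OPTIMIZER_HIDE_VAR(a);
	OPTIMIZER_HIDE_VAR(b);
	return a == b;
}

/* Hypothetical shared pointer, standing in for "p" in the changelog. */
static int value = 42;
static int *p = &value;

/* Reload p until two consecutive loads agree, then dereference b. */
static int fct_ptr_eq(void)
{
	int *a, *b;

	do {
		a = READ_ONCE(p);
		b = READ_ONCE(p);
	} while (!ptr_eq(a, b));	/* compares hidden copies of a and b */
	return *b;
}
```

Since the equality is established on hidden copies rather than on a and
b themselves, the compiler should no longer be able to treat a as
interchangeable with b for the final dereference, which is the
substitution shown in the arm64 listing above.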