From patchwork Mon Jan 13 15:59:36 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 13937717
From: Uros Bizjak
To: x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Uros Bizjak, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
 Dave Hansen, "H. Peter Anvin", Dennis Zhou, Tejun Heo,
 Christoph Lameter, "Peter Zijlstra (Intel)"
Subject: [PATCH 2/2] x86/locking: Use asm_inline for {,try_}cmpxchg{64,128} emulations
Date: Mon, 13 Jan 2025 16:59:36 +0100
Message-ID: <20250113160017.17668-2-ubizjak@gmail.com>
X-Mailer: git-send-email 2.42.0
In-Reply-To: <20250113160017.17668-1-ubizjak@gmail.com>
References: <20250113160017.17668-1-ubizjak@gmail.com>

According to [1], the use of asm pseudo directives in the asm template
can cause the compiler to wrongly estimate the size of the generated
code.

The ALTERNATIVE macro expands to several asm pseudo directives, so its
use in {,try_}cmpxchg{64,128} throws the instruction length estimate
off by an order of magnitude (the compiler estimates the length of the
asm to be more than 20 instructions). This wrong estimate in turn
causes suboptimal inlining decisions for functions that use these
locking primitives.

Use asm_inline instead of just asm. For inlining purposes, the size of
the asm is then taken as the minimum size, ignoring how many
instructions the compiler thinks it is.

[1] https://gcc.gnu.org/onlinedocs/gcc/Size-of-an-asm.html

Signed-off-by: Uros Bizjak
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: Dave Hansen
Cc: "H. Peter Anvin"
Cc: Dennis Zhou
Cc: Tejun Heo
Cc: Christoph Lameter
Cc: "Peter Zijlstra (Intel)"
---
 arch/x86/include/asm/cmpxchg_32.h | 32 +++++++------
 arch/x86/include/asm/percpu.h     | 77 +++++++++++++++----------------
 2 files changed, 55 insertions(+), 54 deletions(-)

diff --git a/arch/x86/include/asm/cmpxchg_32.h b/arch/x86/include/asm/cmpxchg_32.h
index fd1282a783dd..95b5f990ca88 100644
--- a/arch/x86/include/asm/cmpxchg_32.h
+++ b/arch/x86/include/asm/cmpxchg_32.h
@@ -91,12 +91,14 @@ static __always_inline bool __try_cmpxchg64_local(volatile u64 *ptr, u64 *oldp,
 	union __u64_halves o = { .full = (_old), },			\
 			   n = { .full = (_new), };			\
 									\
-	asm volatile(ALTERNATIVE(_lock_loc				\
-				 "call cmpxchg8b_emu",			\
-				 _lock "cmpxchg8b %a[ptr]", X86_FEATURE_CX8) \
-		     : ALT_OUTPUT_SP("+a" (o.low), "+d" (o.high))	\
-		     : "b" (n.low), "c" (n.high), [ptr] "S" (_ptr)	\
-		     : "memory");					\
+	asm_inline volatile(						\
+		ALTERNATIVE(_lock_loc					\
+			    "call cmpxchg8b_emu",			\
+			    _lock "cmpxchg8b %a[ptr]", X86_FEATURE_CX8)	\
+		: ALT_OUTPUT_SP("+a" (o.low), "+d" (o.high))		\
+		: "b" (n.low), "c" (n.high),				\
+		  [ptr] "S" (_ptr)					\
+		: "memory");						\
 									\
 	o.full;								\
 })
@@ -119,14 +121,16 @@ static __always_inline u64 arch_cmpxchg64_local(volatile u64 *ptr, u64 old, u64
 			   n = { .full = (_new), };			\
 	bool ret;							\
 									\
-	asm volatile(ALTERNATIVE(_lock_loc				\
-				 "call cmpxchg8b_emu",			\
-				 _lock "cmpxchg8b %a[ptr]", X86_FEATURE_CX8) \
-		     CC_SET(e)						\
-		     : ALT_OUTPUT_SP(CC_OUT(e) (ret),			\
-				     "+a" (o.low), "+d" (o.high))	\
-		     : "b" (n.low), "c" (n.high), [ptr] "S" (_ptr)	\
-		     : "memory");					\
+	asm_inline volatile(						\
+		ALTERNATIVE(_lock_loc					\
+			    "call cmpxchg8b_emu",			\
+			    _lock "cmpxchg8b %a[ptr]", X86_FEATURE_CX8)	\
+		CC_SET(e)						\
+		: ALT_OUTPUT_SP(CC_OUT(e) (ret),			\
+				"+a" (o.low), "+d" (o.high))		\
+		: "b" (n.low), "c" (n.high),				\
+		  [ptr] "S" (_ptr)					\
+		: "memory");						\
 									\
 	if (unlikely(!ret))						\
 		*(_oldp) = o.full;					\
diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index 0ab991fba7de..08f5f61690b7 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -348,15 +348,14 @@ do {									\
 	old__.var = _oval;						\
 	new__.var = _nval;						\
 									\
-	asm qual (ALTERNATIVE("call this_cpu_cmpxchg8b_emu",		\
-			      "cmpxchg8b " __percpu_arg([var]), X86_FEATURE_CX8) \
-		  : ALT_OUTPUT_SP([var] "+m" (__my_cpu_var(_var)),	\
-				  "+a" (old__.low),			\
-				  "+d" (old__.high))			\
-		  : "b" (new__.low),					\
-		    "c" (new__.high),					\
-		    "S" (&(_var))					\
-		  : "memory");						\
+	asm_inline qual (						\
+		ALTERNATIVE("call this_cpu_cmpxchg8b_emu",		\
+			    "cmpxchg8b " __percpu_arg([var]), X86_FEATURE_CX8) \
+		: ALT_OUTPUT_SP([var] "+m" (__my_cpu_var(_var)),	\
+				"+a" (old__.low), "+d" (old__.high))	\
+		: "b" (new__.low), "c" (new__.high),			\
+		  "S" (&(_var))						\
+		: "memory");						\
 									\
 	old__.var;							\
 })
@@ -378,17 +377,16 @@ do {									\
 	old__.var = *_oval;						\
 	new__.var = _nval;						\
 									\
-	asm qual (ALTERNATIVE("call this_cpu_cmpxchg8b_emu",		\
-			      "cmpxchg8b " __percpu_arg([var]), X86_FEATURE_CX8) \
-		  CC_SET(z)						\
-		  : ALT_OUTPUT_SP(CC_OUT(z) (success),			\
-				  [var] "+m" (__my_cpu_var(_var)),	\
-				  "+a" (old__.low),			\
-				  "+d" (old__.high))			\
-		  : "b" (new__.low),					\
-		    "c" (new__.high),					\
-		    "S" (&(_var))					\
-		  : "memory");						\
+	asm_inline qual (						\
+		ALTERNATIVE("call this_cpu_cmpxchg8b_emu",		\
+			    "cmpxchg8b " __percpu_arg([var]), X86_FEATURE_CX8) \
+		CC_SET(z)						\
+		: ALT_OUTPUT_SP(CC_OUT(z) (success),			\
+				[var] "+m" (__my_cpu_var(_var)),	\
+				"+a" (old__.low), "+d" (old__.high))	\
+		: "b" (new__.low), "c" (new__.high),			\
+		  "S" (&(_var))						\
+		: "memory");						\
 	if (unlikely(!success))						\
 		*_oval = old__.var;					\
 									\
@@ -419,15 +417,14 @@ do {									\
 	old__.var = _oval;						\
 	new__.var = _nval;						\
 									\
-	asm qual (ALTERNATIVE("call this_cpu_cmpxchg16b_emu",		\
-			      "cmpxchg16b " __percpu_arg([var]), X86_FEATURE_CX16) \
-		  : ALT_OUTPUT_SP([var] "+m" (__my_cpu_var(_var)),	\
-				  "+a" (old__.low),			\
-				  "+d" (old__.high))			\
-		  : "b" (new__.low),					\
-		    "c" (new__.high),					\
-		    "S" (&(_var))					\
-		  : "memory");						\
+	asm_inline qual (						\
+		ALTERNATIVE("call this_cpu_cmpxchg16b_emu",		\
+			    "cmpxchg16b " __percpu_arg([var]), X86_FEATURE_CX16) \
+		: ALT_OUTPUT_SP([var] "+m" (__my_cpu_var(_var)),	\
+				"+a" (old__.low), "+d" (old__.high))	\
+		: "b" (new__.low), "c" (new__.high),			\
+		  "S" (&(_var))						\
+		: "memory");						\
 									\
 	old__.var;							\
 })
@@ -449,19 +446,19 @@ do {									\
 	old__.var = *_oval;						\
 	new__.var = _nval;						\
 									\
-	asm qual (ALTERNATIVE("call this_cpu_cmpxchg16b_emu",		\
-			      "cmpxchg16b " __percpu_arg([var]), X86_FEATURE_CX16) \
-		  CC_SET(z)						\
-		  : ALT_OUTPUT_SP(CC_OUT(z) (success),			\
-				  [var] "+m" (__my_cpu_var(_var)),	\
-				  "+a" (old__.low),			\
-				  "+d" (old__.high))			\
-		  : "b" (new__.low),					\
-		    "c" (new__.high),					\
-		    "S" (&(_var))					\
-		  : "memory");						\
+	asm_inline qual (						\
+		ALTERNATIVE("call this_cpu_cmpxchg16b_emu",		\
+			    "cmpxchg16b " __percpu_arg([var]), X86_FEATURE_CX16) \
+		CC_SET(z)						\
+		: ALT_OUTPUT_SP(CC_OUT(z) (success),			\
+				[var] "+m" (__my_cpu_var(_var)),	\
+				"+a" (old__.low), "+d" (old__.high))	\
+		: "b" (new__.low), "c" (new__.high),			\
+		  "S" (&(_var))						\
+		: "memory");						\
 	if (unlikely(!success))						\
 		*_oval = old__.var;					\
+									\
 	likely(success);						\
 })
 
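
Note for readers (not part of the patch): the size heuristic described
in the commit message can be illustrated with a small user-space
sketch. It is only an approximation under stated assumptions. The names
demo_try_cmpxchg64(), demo_selfcheck() and the ".demo_notes" section are
hypothetical, and the .pushsection/.popsection lines merely stand in for
the directives that ALTERNATIVE emits. The asm_inline macro is modeled
on the kernel's definition and assumes GCC 9 or later, or a recent
Clang.

```c
/*
 * Illustrative sketch only; this is not kernel code. demo_try_cmpxchg64(),
 * demo_selfcheck() and the ".demo_notes" section are hypothetical names.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Modeled on the kernel's asm_inline; assumes GCC 9+ or a recent Clang. */
#define asm_inline __asm__ __inline

static inline bool demo_try_cmpxchg64(uint64_t *ptr, uint64_t *oldp,
				      uint64_t newval)
{
#if defined(__x86_64__) && defined(__ELF__) && defined(__GCC_ASM_FLAG_OUTPUTS__)
	uint64_t old = *oldp;
	bool success;

	/*
	 * One real instruction followed by four directive lines. With plain
	 * asm, GCC counts every line of the template as an instruction when
	 * it estimates the size. With asm inline, the statement is costed
	 * at the minimum size for inlining purposes.
	 */
	asm_inline volatile(
		"lock cmpxchgq %[newval], %[mem]\n\t"
		".pushsection .demo_notes, \"a\"\n\t"
		".long 0\n\t"
		".long 0\n\t"
		".popsection"
		: "=@ccz" (success), [mem] "+m" (*ptr), "+a" (old)
		: [newval] "r" (newval)
		: "memory");

	/* On failure, report the value actually found in memory. */
	if (!success)
		*oldp = old;
	return success;
#else
	/* Portable fallback so the sketch also builds on other targets. */
	return __atomic_compare_exchange_n(ptr, oldp, newval, false,
					   __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
#endif
}

/* Usage example: one successful and one failed exchange. */
static inline void demo_selfcheck(void)
{
	uint64_t v = 5, old = 5;

	assert(demo_try_cmpxchg64(&v, &old, 9));	/* 5 matches, store 9 */
	assert(v == 9);

	old = 5;					/* stale expectation */
	assert(!demo_try_cmpxchg64(&v, &old, 1));	/* 9 != 5, no store */
	assert(v == 9 && old == 9);			/* old refreshed */
}
```

Replacing asm_inline with plain __asm__ in the sketch should leave the
generated instruction unchanged. Only the compiler's size estimate for
the statement differs, and with it the inlining decisions for callers.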