From patchwork Thu Feb 13 19:14:26 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 13973937
From: Uros Bizjak
To: x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Uros Bizjak, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, "H.
 Peter Anvin", Dennis Zhou, Tejun Heo, Christoph Lameter, "Peter Zijlstra (Intel)"
Subject: [PATCH RESEND 2/2] x86/locking: Use asm_inline for {,try_}cmpxchg{64,128} emulations
Date: Thu, 13 Feb 2025 20:14:26 +0100
Message-ID: <20250213191457.12377-2-ubizjak@gmail.com>
X-Mailer: git-send-email 2.42.0
In-Reply-To: <20250213191457.12377-1-ubizjak@gmail.com>
References: <20250213191457.12377-1-ubizjak@gmail.com>

According to [1], asm pseudo-directives in the asm template can cause
the compiler to misestimate the size of the generated code.

The ALTERNATIVE macro expands to several asm pseudo-directives, so its
use in {,try_}cmpxchg{64,128} throws the instruction length estimate
off by an order of magnitude (the compiler estimates the length of the
asm to be more than 20 instructions). This wrong estimate in turn
causes suboptimal inlining decisions for functions that use these
locking primitives.

Use asm_inline instead of just asm. For inlining purposes, the size of
the asm is then taken as the minimum size, regardless of how many
instructions the compiler thinks it is.

[1] https://gcc.gnu.org/onlinedocs/gcc/Size-of-an-asm.html

Signed-off-by: Uros Bizjak
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: Dave Hansen
Cc: "H.
 Peter Anvin"
Cc: Dennis Zhou
Cc: Tejun Heo
Cc: Christoph Lameter
Cc: "Peter Zijlstra (Intel)"
---
 arch/x86/include/asm/cmpxchg_32.h | 32 +++++++------
 arch/x86/include/asm/percpu.h     | 77 +++++++++++++++----------------
 2 files changed, 55 insertions(+), 54 deletions(-)

diff --git a/arch/x86/include/asm/cmpxchg_32.h b/arch/x86/include/asm/cmpxchg_32.h
index fd1282a783dd..95b5f990ca88 100644
--- a/arch/x86/include/asm/cmpxchg_32.h
+++ b/arch/x86/include/asm/cmpxchg_32.h
@@ -91,12 +91,14 @@ static __always_inline bool __try_cmpxchg64_local(volatile u64 *ptr, u64 *oldp,
 	union __u64_halves o = { .full = (_old), },			\
 			   n = { .full = (_new), };			\
 									\
-	asm volatile(ALTERNATIVE(_lock_loc				\
-				 "call cmpxchg8b_emu",			\
-				 _lock "cmpxchg8b %a[ptr]", X86_FEATURE_CX8) \
-		     : ALT_OUTPUT_SP("+a" (o.low), "+d" (o.high))	\
-		     : "b" (n.low), "c" (n.high), [ptr] "S" (_ptr)	\
-		     : "memory");					\
+	asm_inline volatile(						\
+		ALTERNATIVE(_lock_loc					\
+			    "call cmpxchg8b_emu",			\
+			    _lock "cmpxchg8b %a[ptr]", X86_FEATURE_CX8)	\
+		: ALT_OUTPUT_SP("+a" (o.low), "+d" (o.high))		\
+		: "b" (n.low), "c" (n.high),				\
+		  [ptr] "S" (_ptr)					\
+		: "memory");						\
 									\
 	o.full;								\
 })
@@ -119,14 +121,16 @@ static __always_inline u64 arch_cmpxchg64_local(volatile u64 *ptr, u64 old, u64
 			   n = { .full = (_new), };			\
 	bool ret;							\
 									\
-	asm volatile(ALTERNATIVE(_lock_loc				\
-				 "call cmpxchg8b_emu",			\
-				 _lock "cmpxchg8b %a[ptr]", X86_FEATURE_CX8) \
-		     CC_SET(e)						\
-		     : ALT_OUTPUT_SP(CC_OUT(e) (ret),			\
-				     "+a" (o.low), "+d" (o.high))	\
-		     : "b" (n.low), "c" (n.high), [ptr] "S" (_ptr)	\
-		     : "memory");					\
+	asm_inline volatile(						\
+		ALTERNATIVE(_lock_loc					\
+			    "call cmpxchg8b_emu",			\
+			    _lock "cmpxchg8b %a[ptr]", X86_FEATURE_CX8)	\
+		CC_SET(e)						\
+		: ALT_OUTPUT_SP(CC_OUT(e) (ret),			\
+				"+a" (o.low), "+d" (o.high))		\
+		: "b" (n.low), "c" (n.high),				\
+		  [ptr] "S" (_ptr)					\
+		: "memory");						\
 									\
 	if (unlikely(!ret))						\
 		*(_oldp) = o.full;					\
diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index 0ab991fba7de..08f5f61690b7 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -348,15 +348,14 @@ do { \
 	old__.var = _oval;						\
 	new__.var = _nval;						\
 									\
-	asm qual (ALTERNATIVE("call this_cpu_cmpxchg8b_emu",		\
-			      "cmpxchg8b " __percpu_arg([var]), X86_FEATURE_CX8) \
-		  : ALT_OUTPUT_SP([var] "+m" (__my_cpu_var(_var)),	\
-				  "+a" (old__.low),			\
-				  "+d" (old__.high))			\
-		  : "b" (new__.low),					\
-		    "c" (new__.high),					\
-		    "S" (&(_var))					\
-		  : "memory");						\
+	asm_inline qual (						\
+		ALTERNATIVE("call this_cpu_cmpxchg8b_emu",		\
+			    "cmpxchg8b " __percpu_arg([var]), X86_FEATURE_CX8) \
+		: ALT_OUTPUT_SP([var] "+m" (__my_cpu_var(_var)),	\
+				"+a" (old__.low), "+d" (old__.high))	\
+		: "b" (new__.low), "c" (new__.high),			\
+		  "S" (&(_var))						\
+		: "memory");						\
 									\
 	old__.var;							\
 })
@@ -378,17 +377,16 @@ do { \
 	old__.var = *_oval;						\
 	new__.var = _nval;						\
 									\
-	asm qual (ALTERNATIVE("call this_cpu_cmpxchg8b_emu",		\
-			      "cmpxchg8b " __percpu_arg([var]), X86_FEATURE_CX8) \
-		  CC_SET(z)						\
-		  : ALT_OUTPUT_SP(CC_OUT(z) (success),			\
-				  [var] "+m" (__my_cpu_var(_var)),	\
-				  "+a" (old__.low),			\
-				  "+d" (old__.high))			\
-		  : "b" (new__.low),					\
-		    "c" (new__.high),					\
-		    "S" (&(_var))					\
-		  : "memory");						\
+	asm_inline qual (						\
+		ALTERNATIVE("call this_cpu_cmpxchg8b_emu",		\
+			    "cmpxchg8b " __percpu_arg([var]), X86_FEATURE_CX8) \
+		CC_SET(z)						\
+		: ALT_OUTPUT_SP(CC_OUT(z) (success),			\
+				[var] "+m" (__my_cpu_var(_var)),	\
+				"+a" (old__.low), "+d" (old__.high))	\
+		: "b" (new__.low), "c" (new__.high),			\
+		  "S" (&(_var))						\
+		: "memory");						\
 	if (unlikely(!success))						\
 		*_oval = old__.var;					\
 									\
@@ -419,15 +417,14 @@ do { \
 	old__.var = _oval;						\
 	new__.var = _nval;						\
 									\
-	asm qual (ALTERNATIVE("call this_cpu_cmpxchg16b_emu",		\
-			      "cmpxchg16b " __percpu_arg([var]), X86_FEATURE_CX16) \
-		  : ALT_OUTPUT_SP([var] "+m" (__my_cpu_var(_var)),	\
-				  "+a" (old__.low),			\
-				  "+d" (old__.high))			\
-		  : "b" (new__.low),					\
-		    "c" (new__.high),					\
-		    "S" (&(_var))					\
-		  : "memory");						\
+	asm_inline qual (						\
+		ALTERNATIVE("call this_cpu_cmpxchg16b_emu",		\
+			    "cmpxchg16b " __percpu_arg([var]), X86_FEATURE_CX16) \
+		: ALT_OUTPUT_SP([var] "+m" (__my_cpu_var(_var)),	\
+				"+a" (old__.low), "+d" (old__.high))	\
+		: "b" (new__.low), "c" (new__.high),			\
+		  "S" (&(_var))						\
+		: "memory");						\
 									\
 	old__.var;							\
 })
@@ -449,19 +446,19 @@ do { \
 	old__.var = *_oval;						\
 	new__.var = _nval;						\
 									\
-	asm qual (ALTERNATIVE("call this_cpu_cmpxchg16b_emu",		\
-			      "cmpxchg16b " __percpu_arg([var]), X86_FEATURE_CX16) \
-		  CC_SET(z)						\
-		  : ALT_OUTPUT_SP(CC_OUT(z) (success),			\
-				  [var] "+m" (__my_cpu_var(_var)),	\
-				  "+a" (old__.low),			\
-				  "+d" (old__.high))			\
-		  : "b" (new__.low),					\
-		    "c" (new__.high),					\
-		    "S" (&(_var))					\
-		  : "memory");						\
+	asm_inline qual (						\
+		ALTERNATIVE("call this_cpu_cmpxchg16b_emu",		\
+			    "cmpxchg16b " __percpu_arg([var]), X86_FEATURE_CX16) \
+		CC_SET(z)						\
+		: ALT_OUTPUT_SP(CC_OUT(z) (success),			\
+				[var] "+m" (__my_cpu_var(_var)),	\
+				"+a" (old__.low), "+d" (old__.high))	\
+		: "b" (new__.low), "c" (new__.high),			\
+		  "S" (&(_var))						\
+		: "memory");						\
 	if (unlikely(!success))						\
 		*_oval = old__.var;					\
+									\
 	likely(success);						\
 })
 