From patchwork Wed Mar 6 16:59:04 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Puranjay Mohan X-Patchwork-Id: 13584292 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 29E92C54E41 for ; Wed, 6 Mar 2024 16:59:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:Message-Id:Date:Subject:To :From:Reply-To:Cc:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:In-Reply-To:References: List-Owner; bh=Ur81UdWVSTg9ZyAavv8pNcehM9gZNmM8F3GLfCPpxc0=; b=3wAhtHRdla4WEA q2PQGUNiKUz7l/JsyamwMEKFjU/ueHHLoyyX3S2M1XPQJdvNUj0E4PQtMddh7xRs6zzFKy1vni1on K21L+gukIIfl7EDM8XRrxo6Kj6TfA1ThtTER6++b4IU1Vri0rLRAhmY/QCMRiN3wg8jhISRkbAKrf KabuJ/RRQQVjuMmb6zcGTbfKqIiz1O4jUQyi394awTchCCIk2asDpYBXycQVDgHM4uwgAjlquTi3E 4873f9KI8UyI7tvgNxf+S9QBZUdEUL342ru5q7ateLJbsrrtMGyDKzuyOtL9SfDIpgY0NunbM+R8/ yMu5efdIRXlqxn9cvR2A==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rhuc5-000000015hz-2bEs; Wed, 06 Mar 2024 16:59:29 +0000 Received: from mail-wr1-x42c.google.com ([2a00:1450:4864:20::42c]) by bombadil.infradead.org with esmtps (Exim 4.97.1 #2 (Red Hat Linux)) id 1rhuc0-000000015fb-35sH for linux-riscv@lists.infradead.org; Wed, 06 Mar 2024 16:59:27 +0000 Received: by mail-wr1-x42c.google.com with SMTP id ffacd0b85a97d-33e4d36f288so748705f8f.1 for ; Wed, 06 Mar 2024 08:59:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1709744361; x=1710349161; darn=lists.infradead.org; h=content-transfer-encoding:mime-version:message-id:date:subject:to :from:from:to:cc:subject:date:message-id:reply-to; bh=GoDKhwgw+miW5ARkGVrNqzMnxuRe2Uy6wo1ivXGP/JA=; b=kx93gwSnXK/TR+rk44o2qQEcGTgDwn1TcvSKplUrb//BH/u+aCf7dru3NgmdUHxTf8 LyVsmr9UgFJfQKsRj0TbJN/OMb6h2GetmrhfXF46tFL9WMEBbAF8c9SU7LlufFjBbIut 4WNvZ1W1w0rkclCqeWbqdAfdRKTz3yNVJmQZzOxYdNVj7Qx/DzRDn3RBd6I3YAJX+tzi 60GUsIkKEzoIt/c4VVQM0wQpe6rwVgljp785H+gaf324XVy4dggfUOfQdLxDB41NzsEm btQ3gg1/oK+GMOpp3mOV26fVRSZcU1kvoUffaxcLXX+hF6YsgPJWRWIdodlZFI1ZSBra /pcA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1709744361; x=1710349161; h=content-transfer-encoding:mime-version:message-id:date:subject:to :from:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=GoDKhwgw+miW5ARkGVrNqzMnxuRe2Uy6wo1ivXGP/JA=; b=VnEs3NDB3qoYMZ/hi/Vha4y2sx5yp5A9CkQwNhd9wDqx+qMyE1xSqmKxciEN1L02St GzqriIZKgWSehIcfKynQ5FXh8F1/3LY82w9LyGlXTkolr15Qf7JTlZfPw1IHNd0fd6d+ hkFgokmrFw31UlHAklWYal59DBOzfXBobfc4TdTbxavWYc5FmwVpyyHSaBSIDO563Gdn dkrqRzfuM8FsoHsjhlfG4GMFMTNzt+FQB9zmrSrnCyY71rjjuJ1H4LZ63/cvv8ZQPa6X wYCMk7YgVM1SSuz3+wUNtfLiLtJ6Iuvxfv4BXZkcZhaMwax03Txp7uEExlizkeardA3c coyg== X-Forwarded-Encrypted: i=1; AJvYcCUdU2NqzF1+9TEwum0jnlaEQFocZb02bBM87rcPN2XjA8l0a8pOM3KfSmry2lyrlMJBz4vJ3ueI48ytEFmQ+kWV1YXtHdDJELA2HgpJPVxR X-Gm-Message-State: AOJu0Yy4L7Xy8ehZCywgqzx8vthB3BvNJBnqX0d/ZXXHzqj9/tCCS0EX tXtLuCtuACSMaF7d8xS2+1FxcVy+ktn6xFG19KUY1hplHAIDS/W6 X-Google-Smtp-Source: AGHT+IHKyDBT04YlSczEFz9XpHogy8dsgMG3n5l3SPmhEJfqIvqI+bjBU0cnm2x+43p7VFNEeJFmXA== X-Received: by 2002:adf:ca04:0:b0:33d:c45:15ec with SMTP id o4-20020adfca04000000b0033d0c4515ecmr5955057wrh.32.1709744361226; Wed, 06 Mar 2024 08:59:21 -0800 (PST) Received: from localhost (54-240-197-231.amazon.com. [54.240.197.231]) by smtp.gmail.com with ESMTPSA id ay30-20020a05600c1e1e00b00412fc25691bsm1294908wmb.38.2024.03.06.08.59.20 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 06 Mar 2024 08:59:20 -0800 (PST) From: Puranjay Mohan To: Paul Walmsley , Palmer Dabbelt , Albert Ou , Steven Rostedt , Masami Hiramatsu , Mark Rutland , Sami Tolvanen , Guo Ren , Puranjay Mohan , Ley Foon Tan , Deepak Gupta , Sia Jee Heng , =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , Song Shuai , =?utf-8?b?Q2zDqW1lbnQgTMOpZ2Vy?= , Al Viro , Jisheng Zhang , linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org Subject: [RFC PATCH] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS Date: Wed, 6 Mar 2024 16:59:04 +0000 Message-Id: <20240306165904.108141-1-puranjay12@gmail.com> X-Mailer: git-send-email 2.40.1 MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240306_085924_858732_7171DF5A X-CRM114-Status: GOOD ( 28.78 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org This patch enables support for DYNAMIC_FTRACE_WITH_CALL_OPS on RISC-V. This allows each ftrace callsite to provide an ftrace_ops to the common ftrace trampoline, allowing each callsite to invoke distinct tracer functions without the need to fall back to list processing or to allocate custom trampolines for each callsite. This significantly speeds up cases where multiple distinct trace functions are used and callsites are mostly traced by a single tracer. The idea and most of the implementation is taken from the ARM64's implementation of the same feature. The idea is to place a pointer to the ftrace_ops as a literal at a fixed offset from the function entry point, which can be recovered by the common ftrace trampoline. We use -fpatchable-function-entry to reserve 8 bytes above the function entry by emitting 2 4 byte or 4 2 byte nops depending on the presence of CONFIG_RISCV_ISA_C. These 8 bytes are patched at runtime with a pointer to the associated ftrace_ops for that callsite. Functions are aligned to 8 bytes to make sure that the accesses to this literal are atomic. This approach allows for directly invoking ftrace_ops::func even for ftrace_ops which are dynamically-allocated (or part of a module), without going via ftrace_ops_list_func. I've benchamrked this with the ftrace_ops sample module on Qemu, with the next version, I will provide benchmarks on real hardware: Without this patch: +-----------------------+-----------------+----------------------------+ | Number of tracers | Total time (ns) | Per-call average time | |-----------------------+-----------------+----------------------------| | Relevant | Irrelevant | 100000 calls | Total (ns) | Overhead (ns) | |----------+------------+-----------------+------------+---------------| | 0 | 0 | 15615700 | 156 | - | | 0 | 1 | 15917600 | 159 | - | | 0 | 2 | 15668000 | 156 | - | | 0 | 10 | 14971500 | 149 | - | | 0 | 100 | 15417600 | 154 | - | | 0 | 200 | 15387000 | 153 | - | |----------+------------+-----------------+------------+---------------| | 1 | 0 | 119906800 | 1199 | 1043 | | 1 | 1 | 137428600 | 1374 | 1218 | | 1 | 2 | 159562400 | 1374 | 1218 | | 1 | 10 | 302099900 | 3020 | 2864 | | 1 | 100 | 2008785500 | 20087 | 19931 | | 1 | 200 | 3965221900 | 39652 | 39496 | |----------+------------+-----------------+------------+---------------| | 1 | 0 | 119166700 | 1191 | 1035 | | 2 | 0 | 157999900 | 1579 | 1423 | | 10 | 0 | 425370100 | 4253 | 4097 | | 100 | 0 | 3595252100 | 35952 | 35796 | | 200 | 0 | 7023485700 | 70234 | 70078 | +----------+------------+-----------------+------------+---------------+ Note: per-call overhead is estimated relative to the baseline case with 0 relevant tracers and 0 irrelevant tracers. With this patch: +-----------------------+-----------------+----------------------------+ | Number of tracers | Total time (ns) | Per-call average time | |-----------------------+-----------------+----------------------------| | Relevant | Irrelevant | 100000 calls | Total (ns) | Overhead (ns) | |----------+------------+-----------------+------------+---------------| | 0 | 0 | 15254600 | 152 | - | | 0 | 1 | 16136700 | 161 | - | | 0 | 2 | 15329500 | 153 | - | | 0 | 10 | 15148800 | 151 | - | | 0 | 100 | 15746900 | 157 | - | | 0 | 200 | 15737400 | 157 | - | |----------+------------+-----------------+------------+---------------| | 1 | 0 | 47909000 | 479 | 327 | | 1 | 1 | 48297400 | 482 | 330 | | 1 | 2 | 47314100 | 473 | 321 | | 1 | 10 | 47844900 | 478 | 326 | | 1 | 100 | 46591900 | 465 | 313 | | 1 | 200 | 47178900 | 471 | 319 | |----------+------------+-----------------+------------+---------------| | 1 | 0 | 46715800 | 467 | 315 | | 2 | 0 | 155134500 | 1551 | 1399 | | 10 | 0 | 442672800 | 4426 | 4274 | | 100 | 0 | 4092353900 | 40923 | 40771 | | 200 | 0 | 7135796400 | 71357 | 71205 | +----------+------------+-----------------+------------+---------------+ Note: per-call overhead is estimated relative to the baseline case with 0 relevant tracers and 0 irrelevant tracers. As can be seen from the above: a) Whenever there is a single relevant tracer function associated with a tracee, the overhead of invoking the tracer is constant, and does not scale with the number of tracers which are *not* associated with that tracee. b) The overhead for a single relevant tracer has dropped to ~1/3 of the overhead prior to this series (from 1035ns to 315ns). This is largely due to permitting calls to dynamically-allocated ftrace_ops without going through ftrace_ops_list_func. Why is this patch a RFC patch: 1. I saw some rcu stalls on Qemu and need to debug them and see if they were introduced by this patch. 2. This needs to be tested thoroughly on real hardware. 3. Seeking reviews to fix any fundamental problems with this patch that I might have missed due to my lack of RISC-V architecture knowledge. 4. I would like to benchmark this on real hardware and put the results in the commit message. Signed-off-by: Puranjay Mohan --- arch/riscv/Kconfig | 2 ++ arch/riscv/Makefile | 8 +++++ arch/riscv/include/asm/ftrace.h | 3 ++ arch/riscv/kernel/asm-offsets.c | 3 ++ arch/riscv/kernel/ftrace.c | 59 +++++++++++++++++++++++++++++++++ arch/riscv/kernel/mcount-dyn.S | 42 ++++++++++++++++++++--- 6 files changed, 112 insertions(+), 5 deletions(-) diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index 0bfcfec67ed5..e474742e23b2 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -78,6 +78,7 @@ config RISCV select EDAC_SUPPORT select FRAME_POINTER if PERF_EVENTS || (FUNCTION_TRACER && !DYNAMIC_FTRACE) select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY if DYNAMIC_FTRACE + select FUNCTION_ALIGNMENT_8B if DYNAMIC_FTRACE_WITH_CALL_OPS select GENERIC_ARCH_TOPOLOGY select GENERIC_ATOMIC64 if !64BIT select GENERIC_CLOCKEVENTS_BROADCAST if SMP @@ -127,6 +128,7 @@ config RISCV select HAVE_DYNAMIC_FTRACE if !XIP_KERNEL && MMU && (CLANG_SUPPORTS_DYNAMIC_FTRACE || GCC_SUPPORTS_DYNAMIC_FTRACE) select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS select HAVE_DYNAMIC_FTRACE_WITH_REGS if HAVE_DYNAMIC_FTRACE + select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS if (DYNAMIC_FTRACE_WITH_REGS && !CFI_CLANG) select HAVE_FTRACE_MCOUNT_RECORD if !XIP_KERNEL select HAVE_FUNCTION_GRAPH_TRACER select HAVE_FUNCTION_GRAPH_RETVAL if HAVE_FUNCTION_GRAPH_TRACER diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile index 252d63942f34..875ad5dc3d32 100644 --- a/arch/riscv/Makefile +++ b/arch/riscv/Makefile @@ -14,12 +14,20 @@ endif ifeq ($(CONFIG_DYNAMIC_FTRACE),y) LDFLAGS_vmlinux += --no-relax KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY +ifeq ($(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS), y) +ifeq ($(CONFIG_RISCV_ISA_C),y) + CC_FLAGS_FTRACE := -fpatchable-function-entry=8,4 +else + CC_FLAGS_FTRACE := -fpatchable-function-entry=4,2 +endif +else ifeq ($(CONFIG_RISCV_ISA_C),y) CC_FLAGS_FTRACE := -fpatchable-function-entry=4 else CC_FLAGS_FTRACE := -fpatchable-function-entry=2 endif endif +endif ifeq ($(CONFIG_CMODEL_MEDLOW),y) KBUILD_CFLAGS_MODULE += -mcmodel=medany diff --git a/arch/riscv/include/asm/ftrace.h b/arch/riscv/include/asm/ftrace.h index 329172122952..c9a84222c9ea 100644 --- a/arch/riscv/include/asm/ftrace.h +++ b/arch/riscv/include/asm/ftrace.h @@ -28,6 +28,9 @@ void MCOUNT_NAME(void); static inline unsigned long ftrace_call_adjust(unsigned long addr) { + if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS)) + return addr + 8; + return addr; } diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c index a03129f40c46..7d7c4b486852 100644 --- a/arch/riscv/kernel/asm-offsets.c +++ b/arch/riscv/kernel/asm-offsets.c @@ -488,4 +488,7 @@ void asm_offsets(void) DEFINE(STACKFRAME_SIZE_ON_STACK, ALIGN(sizeof(struct stackframe), STACK_ALIGN)); OFFSET(STACKFRAME_FP, stackframe, fp); OFFSET(STACKFRAME_RA, stackframe, ra); +#ifdef CONFIG_FUNCTION_TRACER + DEFINE(FTRACE_OPS_FUNC, offsetof(struct ftrace_ops, func)); +#endif } diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c index f5aa24d9e1c1..e2e75e15d32e 100644 --- a/arch/riscv/kernel/ftrace.c +++ b/arch/riscv/kernel/ftrace.c @@ -82,9 +82,52 @@ static int __ftrace_modify_call(unsigned long hook_pos, unsigned long target, return 0; } +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS +static const struct ftrace_ops *riscv64_rec_get_ops(struct dyn_ftrace *rec) +{ + const struct ftrace_ops *ops = NULL; + + if (rec->flags & FTRACE_FL_CALL_OPS_EN) { + ops = ftrace_find_unique_ops(rec); + WARN_ON_ONCE(!ops); + } + + if (!ops) + ops = &ftrace_list_ops; + + return ops; +} + +static int ftrace_rec_set_ops(const struct dyn_ftrace *rec, + const struct ftrace_ops *ops) +{ + unsigned long literal = rec->ip - 8; + + return patch_text_nosync((void *)literal, &ops, sizeof(ops)); +} + +static int ftrace_rec_set_nop_ops(struct dyn_ftrace *rec) +{ + return ftrace_rec_set_ops(rec, &ftrace_nop_ops); +} + +static int ftrace_rec_update_ops(struct dyn_ftrace *rec) +{ + return ftrace_rec_set_ops(rec, riscv64_rec_get_ops(rec)); +} +#else +static int ftrace_rec_set_nop_ops(struct dyn_ftrace *rec) { return 0; } +static int ftrace_rec_update_ops(struct dyn_ftrace *rec) { return 0; } +#endif + int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr) { unsigned int call[2]; + int ret; + + ret = ftrace_rec_update_ops(rec); + if (ret) + return ret; make_call_t0(rec->ip, addr, call); @@ -98,6 +141,11 @@ int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec, unsigned long addr) { unsigned int nops[2] = {NOP4, NOP4}; + int ret; + + ret = ftrace_rec_set_nop_ops(rec); + if (ret) + return ret; if (patch_text_nosync((void *)rec->ip, nops, MCOUNT_INSN_SIZE)) return -EPERM; @@ -125,6 +173,13 @@ int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec) int ftrace_update_ftrace_func(ftrace_func_t func) { + /* + * When using CALL_OPS, the function to call is associated with the + * call site, and we don't have a global function pointer to update. + */ + if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS)) + return 0; + int ret = __ftrace_modify_call((unsigned long)&ftrace_call, (unsigned long)func, true, true); if (!ret) { @@ -147,6 +202,10 @@ int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr, make_call_t0(caller, old_addr, call); ret = ftrace_check_current_call(caller, call); + if (ret) + return ret; + + ret = ftrace_rec_update_ops(rec); if (ret) return ret; diff --git a/arch/riscv/kernel/mcount-dyn.S b/arch/riscv/kernel/mcount-dyn.S index b7561288e8da..cb241e36e514 100644 --- a/arch/riscv/kernel/mcount-dyn.S +++ b/arch/riscv/kernel/mcount-dyn.S @@ -191,11 +191,35 @@ .endm .macro PREPARE_ARGS - addi a0, t0, -FENTRY_RA_OFFSET + addi a0, t0, -FENTRY_RA_OFFSET // ip (callsite's auipc insn) +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS + /* + * When CALL_OPS is enabled (2 or 4) nops [8B] are placed before the + * function entry, these are later overwritten with the pointer to the + * associated struct ftrace_ops. + * + * -8: &ftrace_ops of the associated tracer function. + *: + * 0: auipc t0/ra, 0x? + * 4: jalr t0/ra, ?(t0/ra) + * + * -8: &ftrace_nop_ops + *: + * 0: nop + * 4: nop + * + * t0 is set to ip+8 after the jalr is executed at the callsite, + * so we find the associated op at t0-16. + */ + mv a1, ra // parent_ip + REG_L a2, -16(t0) // op + REG_L ra, FTRACE_OPS_FUNC(a2) // op->func +#else la a1, function_trace_op - REG_L a2, 0(a1) - mv a1, ra - mv a3, sp + REG_L a2, 0(a1) // op + mv a1, ra // parent_ip +#endif + mv a3, sp // regs .endm #endif /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */ @@ -233,8 +257,12 @@ SYM_FUNC_START(ftrace_regs_caller) SAVE_ABI_REGS 1 PREPARE_ARGS +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS + jalr ra +#else SYM_INNER_LABEL(ftrace_regs_call, SYM_L_GLOBAL) call ftrace_stub +#endif RESTORE_ABI_REGS 1 bnez t1, .Ldirect @@ -247,9 +275,13 @@ SYM_FUNC_START(ftrace_caller) SAVE_ABI_REGS 0 PREPARE_ARGS -SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL) +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS + jalr ra +#else +SYM_INNER_LABEL(ftrace_regs_call, SYM_L_GLOBAL) call ftrace_stub +#endif RESTORE_ABI_REGS 0 jr t0 SYM_FUNC_END(ftrace_caller)