From patchwork Thu Oct 12 03:50:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418208 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7115F812; Thu, 12 Oct 2023 03:51:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="ALjs+XS1" Received: from mail-pf1-x432.google.com (mail-pf1-x432.google.com [IPv6:2607:f8b0:4864:20::432]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3F7C1B6; Wed, 11 Oct 2023 20:51:16 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org, Huacai Chen , WANG Rui Subject: [PATCH 01/48] perf annotate: Move raw_comment and raw_func_start Date: Wed, 11 Oct 2023 20:50:24 -0700 Message-ID: <20231012035111.676789-2-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, 
HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Those two fields are used only for the jump_ops, so move them into the union to save some bytes. Also add a jump__delete() callback so that the fields are not freed, as they point into the raw string rather than to newly allocated strings. Cc: Huacai Chen Cc: WANG Rui Signed-off-by: Namhyung Kim --- .../perf/arch/loongarch/annotate/instructions.c | 6 +++--- tools/perf/util/annotate.c | 17 +++++++++++++---- tools/perf/util/annotate.h | 6 ++++-- 3 files changed, 20 insertions(+), 9 deletions(-) diff --git a/tools/perf/arch/loongarch/annotate/instructions.c b/tools/perf/arch/loongarch/annotate/instructions.c index 98e19c5366ac..21cc7e4149f7 100644 --- a/tools/perf/arch/loongarch/annotate/instructions.c +++ b/tools/perf/arch/loongarch/annotate/instructions.c @@ -61,10 +61,10 @@ static int loongarch_jump__parse(struct arch *arch, struct ins_operands *ops, st const char *c = strchr(ops->raw, '#'); u64 start, end; - ops->raw_comment = strchr(ops->raw, arch->objdump.comment_char); - ops->raw_func_start = strchr(ops->raw, '<'); + ops->jump.raw_comment = strchr(ops->raw, arch->objdump.comment_char); + ops->jump.raw_func_start = strchr(ops->raw, '<'); - if (ops->raw_func_start && c > ops->raw_func_start) + if (ops->jump.raw_func_start && c > ops->jump.raw_func_start) c = NULL; if (c++ != NULL) diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index 82956adf9963..211636e65b03 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -340,10 +340,10 @@ bool ins__is_call(const struct ins *ins) */ static inline const char *validate_comma(const char *c, struct ins_operands *ops) { - if (ops->raw_comment && c > ops->raw_comment) + if (ops->jump.raw_comment && c > ops->jump.raw_comment) return NULL; - if (ops->raw_func_start && c > ops->raw_func_start) + if (ops->jump.raw_func_start && c > 
ops->jump.raw_func_start) return NULL; return c; @@ -359,8 +359,8 @@ static int jump__parse(struct arch *arch, struct ins_operands *ops, struct map_s const char *c = strchr(ops->raw, ','); u64 start, end; - ops->raw_comment = strchr(ops->raw, arch->objdump.comment_char); - ops->raw_func_start = strchr(ops->raw, '<'); + ops->jump.raw_comment = strchr(ops->raw, arch->objdump.comment_char); + ops->jump.raw_func_start = strchr(ops->raw, '<'); c = validate_comma(c, ops); @@ -462,7 +462,16 @@ static int jump__scnprintf(struct ins *ins, char *bf, size_t size, ops->target.offset); } +static void jump__delete(struct ins_operands *ops __maybe_unused) +{ + /* + * The ops->jump.raw_comment and ops->jump.raw_func_start belong to the + * raw string, don't free them. + */ +} + static struct ins_ops jump_ops = { + .free = jump__delete, .parse = jump__parse, .scnprintf = jump__scnprintf, }; diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h index 962780559176..9d8b4199e3bd 100644 --- a/tools/perf/util/annotate.h +++ b/tools/perf/util/annotate.h @@ -31,8 +31,6 @@ struct ins { struct ins_operands { char *raw; - char *raw_comment; - char *raw_func_start; struct { char *raw; char *name; @@ -52,6 +50,10 @@ struct ins_operands { struct ins ins; struct ins_operands *ops; } locked; + struct { + char *raw_comment; + char *raw_func_start; + } jump; }; }; From patchwork Thu Oct 12 03:50:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418210 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0D67981F; Thu, 12 Oct 2023 03:51:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="eaZQg1tL" Received: from 
mail-pl1-x631.google.com (mail-pl1-x631.google.com [IPv6:2607:f8b0:4864:20::631]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A0D6CB7; Wed, 11 Oct 2023 20:51:17 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 02/48] perf annotate: Check if operand has multiple regs Date: Wed, 11 Oct 2023 20:50:25 -0700 Message-ID: <20231012035111.676789-3-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net It needs to check all possible information in an instruction. Let's add a field indicating if the operand has multiple registers. 
It'll be used to search for type information in cases like an array access on x86: mov 0x10(%rax,%rbx,8), %rcx ------------- here Signed-off-by: Namhyung Kim --- tools/perf/util/annotate.c | 36 ++++++++++++++++++++++++++++++++++++ tools/perf/util/annotate.h | 2 ++ 2 files changed, 38 insertions(+) diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index 211636e65b03..605298410ed4 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -85,6 +85,8 @@ struct arch { struct { char comment_char; char skip_functions_char; + char register_char; + char memory_ref_char; } objdump; }; @@ -188,6 +190,8 @@ static struct arch architectures[] = { .insn_suffix = "bwlq", .objdump = { .comment_char = '#', + .register_char = '%', + .memory_ref_char = '(', }, }, { @@ -566,6 +570,34 @@ static struct ins_ops lock_ops = { .scnprintf = lock__scnprintf, }; +/* + * Check if the operand has more than one registers like x86 SIB addressing: + * 0x1234(%rax, %rbx, 8) + * + * But it doesn't care segment selectors like %gs:0x5678(%rcx), so just check + * the input string after 'memory_ref_char' if exists. 
+ */ +static bool check_multi_regs(struct arch *arch, const char *op) +{ + int count = 0; + + if (arch->objdump.register_char == 0) + return false; + + if (arch->objdump.memory_ref_char) { + op = strchr(op, arch->objdump.memory_ref_char); + if (op == NULL) + return false; + } + + while ((op = strchr(op, arch->objdump.register_char)) != NULL) { + count++; + op++; + } + + return count > 1; +} + static int mov__parse(struct arch *arch, struct ins_operands *ops, struct map_symbol *ms __maybe_unused) { char *s = strchr(ops->raw, ','), *target, *comment, prev; @@ -593,6 +625,8 @@ static int mov__parse(struct arch *arch, struct ins_operands *ops, struct map_sy if (ops->source.raw == NULL) return -1; + ops->source.multi_regs = check_multi_regs(arch, ops->source.raw); + target = skip_spaces(++s); comment = strchr(s, arch->objdump.comment_char); @@ -613,6 +647,8 @@ static int mov__parse(struct arch *arch, struct ins_operands *ops, struct map_sy if (ops->target.raw == NULL) goto out_free_source; + ops->target.multi_regs = check_multi_regs(arch, ops->target.raw); + if (comment == NULL) return 0; diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h index 9d8b4199e3bd..e33a55431bad 100644 --- a/tools/perf/util/annotate.h +++ b/tools/perf/util/annotate.h @@ -39,12 +39,14 @@ struct ins_operands { s64 offset; bool offset_avail; bool outside; + bool multi_regs; } target; union { struct { char *raw; char *name; u64 addr; + bool multi_regs; } source; struct { struct ins ins; From patchwork Thu Oct 12 03:50:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418211 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 43361A59; Thu, 12 Oct 2023 03:51:21 +0000 (UTC) 
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="gwYQjkFo" Received: from mail-pl1-x636.google.com (mail-pl1-x636.google.com [IPv6:2607:f8b0:4864:20::636]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 20875BA; Wed, 11 Oct 2023 20:51:19 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 03/48] perf tools: Add util/debuginfo.[ch] files Date: Wed, 11 Oct 2023 20:50:26 -0700 Message-ID: <20231012035111.676789-4-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Split the debuginfo data structure and related functions into a separate file so that they can be used by code other than the probe-finder. 
Cc: Masami Hiramatsu Signed-off-by: Namhyung Kim --- tools/perf/util/Build | 1 + tools/perf/util/debuginfo.c | 205 +++++++++++++++++++++++++++++++++ tools/perf/util/debuginfo.h | 64 ++++++++++ tools/perf/util/probe-finder.c | 193 +------------------------------ tools/perf/util/probe-finder.h | 19 +-- 5 files changed, 272 insertions(+), 210 deletions(-) create mode 100644 tools/perf/util/debuginfo.c create mode 100644 tools/perf/util/debuginfo.h diff --git a/tools/perf/util/Build b/tools/perf/util/Build index 0ea5a9d368d4..a82122516720 100644 --- a/tools/perf/util/Build +++ b/tools/perf/util/Build @@ -194,6 +194,7 @@ endif perf-$(CONFIG_DWARF) += probe-finder.o perf-$(CONFIG_DWARF) += dwarf-aux.o perf-$(CONFIG_DWARF) += dwarf-regs.o +perf-$(CONFIG_DWARF) += debuginfo.o perf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o perf-$(CONFIG_LOCAL_LIBUNWIND) += unwind-libunwind-local.o diff --git a/tools/perf/util/debuginfo.c b/tools/perf/util/debuginfo.c new file mode 100644 index 000000000000..19acf4775d35 --- /dev/null +++ b/tools/perf/util/debuginfo.c @@ -0,0 +1,205 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * DWARF debug information handling code. Copied from probe-finder.c. + * + * Written by Masami Hiramatsu + */ + +#include +#include +#include +#include +#include +#include +#include + +#include "build-id.h" +#include "dso.h" +#include "debug.h" +#include "debuginfo.h" +#include "symbol.h" + +#ifdef HAVE_DEBUGINFOD_SUPPORT +#include +#endif + +/* Dwarf FL wrappers */ +static char *debuginfo_path; /* Currently dummy */ + +static const Dwfl_Callbacks offline_callbacks = { + .find_debuginfo = dwfl_standard_find_debuginfo, + .debuginfo_path = &debuginfo_path, + + .section_address = dwfl_offline_section_address, + + /* We use this table for core files too. 
*/ + .find_elf = dwfl_build_id_find_elf, +}; + +/* Get a Dwarf from offline image */ +static int debuginfo__init_offline_dwarf(struct debuginfo *dbg, + const char *path) +{ + GElf_Addr dummy; + int fd; + + fd = open(path, O_RDONLY); + if (fd < 0) + return fd; + + dbg->dwfl = dwfl_begin(&offline_callbacks); + if (!dbg->dwfl) + goto error; + + dwfl_report_begin(dbg->dwfl); + dbg->mod = dwfl_report_offline(dbg->dwfl, "", "", fd); + if (!dbg->mod) + goto error; + + dbg->dbg = dwfl_module_getdwarf(dbg->mod, &dbg->bias); + if (!dbg->dbg) + goto error; + + dwfl_module_build_id(dbg->mod, &dbg->build_id, &dummy); + + dwfl_report_end(dbg->dwfl, NULL, NULL); + + return 0; +error: + if (dbg->dwfl) + dwfl_end(dbg->dwfl); + else + close(fd); + memset(dbg, 0, sizeof(*dbg)); + + return -ENOENT; +} + +static struct debuginfo *__debuginfo__new(const char *path) +{ + struct debuginfo *dbg = zalloc(sizeof(*dbg)); + if (!dbg) + return NULL; + + if (debuginfo__init_offline_dwarf(dbg, path) < 0) + zfree(&dbg); + if (dbg) + pr_debug("Open Debuginfo file: %s\n", path); + return dbg; +} + +enum dso_binary_type distro_dwarf_types[] = { + DSO_BINARY_TYPE__FEDORA_DEBUGINFO, + DSO_BINARY_TYPE__UBUNTU_DEBUGINFO, + DSO_BINARY_TYPE__OPENEMBEDDED_DEBUGINFO, + DSO_BINARY_TYPE__BUILDID_DEBUGINFO, + DSO_BINARY_TYPE__MIXEDUP_UBUNTU_DEBUGINFO, + DSO_BINARY_TYPE__NOT_FOUND, +}; + +struct debuginfo *debuginfo__new(const char *path) +{ + enum dso_binary_type *type; + char buf[PATH_MAX], nil = '\0'; + struct dso *dso; + struct debuginfo *dinfo = NULL; + struct build_id bid; + + /* Try to open distro debuginfo files */ + dso = dso__new(path); + if (!dso) + goto out; + + /* Set the build id for DSO_BINARY_TYPE__BUILDID_DEBUGINFO */ + if (is_regular_file(path) && filename__read_build_id(path, &bid) > 0) + dso__set_build_id(dso, &bid); + + for (type = distro_dwarf_types; + !dinfo && *type != DSO_BINARY_TYPE__NOT_FOUND; + type++) { + if (dso__read_binary_type_filename(dso, *type, &nil, + buf, PATH_MAX) < 0) + 
continue; + dinfo = __debuginfo__new(buf); + } + dso__put(dso); + +out: + /* if failed to open all distro debuginfo, open given binary */ + return dinfo ? : __debuginfo__new(path); +} + +void debuginfo__delete(struct debuginfo *dbg) +{ + if (dbg) { + if (dbg->dwfl) + dwfl_end(dbg->dwfl); + free(dbg); + } +} + +/* For the kernel module, we need a special code to get a DIE */ +int debuginfo__get_text_offset(struct debuginfo *dbg, Dwarf_Addr *offs, + bool adjust_offset) +{ + int n, i; + Elf32_Word shndx; + Elf_Scn *scn; + Elf *elf; + GElf_Shdr mem, *shdr; + const char *p; + + elf = dwfl_module_getelf(dbg->mod, &dbg->bias); + if (!elf) + return -EINVAL; + + /* Get the number of relocations */ + n = dwfl_module_relocations(dbg->mod); + if (n < 0) + return -ENOENT; + /* Search the relocation related .text section */ + for (i = 0; i < n; i++) { + p = dwfl_module_relocation_info(dbg->mod, i, &shndx); + if (strcmp(p, ".text") == 0) { + /* OK, get the section header */ + scn = elf_getscn(elf, shndx); + if (!scn) + return -ENOENT; + shdr = gelf_getshdr(scn, &mem); + if (!shdr) + return -ENOENT; + *offs = shdr->sh_addr; + if (adjust_offset) + *offs -= shdr->sh_offset; + } + } + return 0; +} + +#ifdef HAVE_DEBUGINFOD_SUPPORT +int get_source_from_debuginfod(const char *raw_path, + const char *sbuild_id, char **new_path) +{ + debuginfod_client *c = debuginfod_begin(); + const char *p = raw_path; + int fd; + + if (!c) + return -ENOMEM; + + fd = debuginfod_find_source(c, (const unsigned char *)sbuild_id, + 0, p, new_path); + pr_debug("Search %s from debuginfod -> %d\n", p, fd); + if (fd >= 0) + close(fd); + debuginfod_end(c); + if (fd < 0) { + pr_debug("Failed to find %s in debuginfod (%s)\n", + raw_path, sbuild_id); + return -ENOENT; + } + pr_debug("Got a source %s\n", *new_path); + + return 0; +} +#endif /* HAVE_DEBUGINFOD_SUPPORT */ diff --git a/tools/perf/util/debuginfo.h b/tools/perf/util/debuginfo.h new file mode 100644 index 000000000000..4d65b8c605fc --- /dev/null +++ 
b/tools/perf/util/debuginfo.h @@ -0,0 +1,64 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +#ifndef _PERF_DEBUGINFO_H +#define _PERF_DEBUGINFO_H + +#include +#include + +#ifdef HAVE_DWARF_SUPPORT + +#include "dwarf-aux.h" + +/* debug information structure */ +struct debuginfo { + Dwarf *dbg; + Dwfl_Module *mod; + Dwfl *dwfl; + Dwarf_Addr bias; + const unsigned char *build_id; +}; + +/* This also tries to open distro debuginfo */ +struct debuginfo *debuginfo__new(const char *path); +void debuginfo__delete(struct debuginfo *dbg); + +int debuginfo__get_text_offset(struct debuginfo *dbg, Dwarf_Addr *offs, + bool adjust_offset); + +#else /* HAVE_DWARF_SUPPORT */ + +/* dummy debug information structure */ +struct debuginfo { +}; + +static inline struct debuginfo *debuginfo__new(const char *path __maybe_unused) +{ + return NULL; +} + +static inline void debuginfo__delete(struct debuginfo *dbg __maybe_unused) +{ +} + +static inline int debuginfo__get_text_offset(struct debuginfo *dbg __maybe_unused, + Dwarf_Addr *offs __maybe_unused, + bool adjust_offset __maybe_unused) +{ + return -EINVAL; +} + +#endif /* HAVE_DWARF_SUPPORT */ + +#ifdef HAVE_DEBUGINFOD_SUPPORT +int get_source_from_debuginfod(const char *raw_path, const char *sbuild_id, + char **new_path); +#else /* HAVE_DEBUGINFOD_SUPPORT */ +static inline int get_source_from_debuginfod(const char *raw_path __maybe_unused, + const char *sbuild_id __maybe_unused, + char **new_path __maybe_unused) +{ + return -ENOTSUP; +} +#endif /* HAVE_DEBUGINFOD_SUPPORT */ + +#endif /* _PERF_DEBUGINFO_H */ diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c index f171360b0ef4..8d3dd85f9ff4 100644 --- a/tools/perf/util/probe-finder.c +++ b/tools/perf/util/probe-finder.c @@ -23,6 +23,7 @@ #include "event.h" #include "dso.h" #include "debug.h" +#include "debuginfo.h" #include "intlist.h" #include "strbuf.h" #include "strlist.h" @@ -31,128 +32,9 @@ #include "probe-file.h" #include "string2.h" -#ifdef 
HAVE_DEBUGINFOD_SUPPORT -#include -#endif - /* Kprobe tracer basic type is up to u64 */ #define MAX_BASIC_TYPE_BITS 64 -/* Dwarf FL wrappers */ -static char *debuginfo_path; /* Currently dummy */ - -static const Dwfl_Callbacks offline_callbacks = { - .find_debuginfo = dwfl_standard_find_debuginfo, - .debuginfo_path = &debuginfo_path, - - .section_address = dwfl_offline_section_address, - - /* We use this table for core files too. */ - .find_elf = dwfl_build_id_find_elf, -}; - -/* Get a Dwarf from offline image */ -static int debuginfo__init_offline_dwarf(struct debuginfo *dbg, - const char *path) -{ - GElf_Addr dummy; - int fd; - - fd = open(path, O_RDONLY); - if (fd < 0) - return fd; - - dbg->dwfl = dwfl_begin(&offline_callbacks); - if (!dbg->dwfl) - goto error; - - dwfl_report_begin(dbg->dwfl); - dbg->mod = dwfl_report_offline(dbg->dwfl, "", "", fd); - if (!dbg->mod) - goto error; - - dbg->dbg = dwfl_module_getdwarf(dbg->mod, &dbg->bias); - if (!dbg->dbg) - goto error; - - dwfl_module_build_id(dbg->mod, &dbg->build_id, &dummy); - - dwfl_report_end(dbg->dwfl, NULL, NULL); - - return 0; -error: - if (dbg->dwfl) - dwfl_end(dbg->dwfl); - else - close(fd); - memset(dbg, 0, sizeof(*dbg)); - - return -ENOENT; -} - -static struct debuginfo *__debuginfo__new(const char *path) -{ - struct debuginfo *dbg = zalloc(sizeof(*dbg)); - if (!dbg) - return NULL; - - if (debuginfo__init_offline_dwarf(dbg, path) < 0) - zfree(&dbg); - if (dbg) - pr_debug("Open Debuginfo file: %s\n", path); - return dbg; -} - -enum dso_binary_type distro_dwarf_types[] = { - DSO_BINARY_TYPE__FEDORA_DEBUGINFO, - DSO_BINARY_TYPE__UBUNTU_DEBUGINFO, - DSO_BINARY_TYPE__OPENEMBEDDED_DEBUGINFO, - DSO_BINARY_TYPE__BUILDID_DEBUGINFO, - DSO_BINARY_TYPE__MIXEDUP_UBUNTU_DEBUGINFO, - DSO_BINARY_TYPE__NOT_FOUND, -}; - -struct debuginfo *debuginfo__new(const char *path) -{ - enum dso_binary_type *type; - char buf[PATH_MAX], nil = '\0'; - struct dso *dso; - struct debuginfo *dinfo = NULL; - struct build_id bid; - - /* 
Try to open distro debuginfo files */ - dso = dso__new(path); - if (!dso) - goto out; - - /* Set the build id for DSO_BINARY_TYPE__BUILDID_DEBUGINFO */ - if (is_regular_file(path) && filename__read_build_id(path, &bid) > 0) - dso__set_build_id(dso, &bid); - - for (type = distro_dwarf_types; - !dinfo && *type != DSO_BINARY_TYPE__NOT_FOUND; - type++) { - if (dso__read_binary_type_filename(dso, *type, &nil, - buf, PATH_MAX) < 0) - continue; - dinfo = __debuginfo__new(buf); - } - dso__put(dso); - -out: - /* if failed to open all distro debuginfo, open given binary */ - return dinfo ? : __debuginfo__new(path); -} - -void debuginfo__delete(struct debuginfo *dbg) -{ - if (dbg) { - if (dbg->dwfl) - dwfl_end(dbg->dwfl); - free(dbg); - } -} - /* * Probe finder related functions */ @@ -1677,44 +1559,6 @@ int debuginfo__find_available_vars_at(struct debuginfo *dbg, return (ret < 0) ? ret : af.nvls; } -/* For the kernel module, we need a special code to get a DIE */ -int debuginfo__get_text_offset(struct debuginfo *dbg, Dwarf_Addr *offs, - bool adjust_offset) -{ - int n, i; - Elf32_Word shndx; - Elf_Scn *scn; - Elf *elf; - GElf_Shdr mem, *shdr; - const char *p; - - elf = dwfl_module_getelf(dbg->mod, &dbg->bias); - if (!elf) - return -EINVAL; - - /* Get the number of relocations */ - n = dwfl_module_relocations(dbg->mod); - if (n < 0) - return -ENOENT; - /* Search the relocation related .text section */ - for (i = 0; i < n; i++) { - p = dwfl_module_relocation_info(dbg->mod, i, &shndx); - if (strcmp(p, ".text") == 0) { - /* OK, get the section header */ - scn = elf_getscn(elf, shndx); - if (!scn) - return -ENOENT; - shdr = gelf_getshdr(scn, &mem); - if (!shdr) - return -ENOENT; - *offs = shdr->sh_addr; - if (adjust_offset) - *offs -= shdr->sh_offset; - } - } - return 0; -} - /* Reverse search */ int debuginfo__find_probe_point(struct debuginfo *dbg, u64 addr, struct perf_probe_point *ppt) @@ -2009,41 +1853,6 @@ int debuginfo__find_line_range(struct debuginfo *dbg, struct 
line_range *lr) return (ret < 0) ? ret : lf.found; } -#ifdef HAVE_DEBUGINFOD_SUPPORT -/* debuginfod doesn't require the comp_dir but buildid is required */ -static int get_source_from_debuginfod(const char *raw_path, - const char *sbuild_id, char **new_path) -{ - debuginfod_client *c = debuginfod_begin(); - const char *p = raw_path; - int fd; - - if (!c) - return -ENOMEM; - - fd = debuginfod_find_source(c, (const unsigned char *)sbuild_id, - 0, p, new_path); - pr_debug("Search %s from debuginfod -> %d\n", p, fd); - if (fd >= 0) - close(fd); - debuginfod_end(c); - if (fd < 0) { - pr_debug("Failed to find %s in debuginfod (%s)\n", - raw_path, sbuild_id); - return -ENOENT; - } - pr_debug("Got a source %s\n", *new_path); - - return 0; -} -#else -static inline int get_source_from_debuginfod(const char *raw_path __maybe_unused, - const char *sbuild_id __maybe_unused, - char **new_path __maybe_unused) -{ - return -ENOTSUP; -} -#endif /* * Find a src file from a DWARF tag path. Prepend optional source path prefix * and chop off leading directories that do not exist. 
Result is passed back as diff --git a/tools/perf/util/probe-finder.h b/tools/perf/util/probe-finder.h index 8bc1c80d3c1c..3add5ff516e1 100644 --- a/tools/perf/util/probe-finder.h +++ b/tools/perf/util/probe-finder.h @@ -24,21 +24,7 @@ static inline int is_c_varname(const char *name) #ifdef HAVE_DWARF_SUPPORT #include "dwarf-aux.h" - -/* TODO: export debuginfo data structure even if no dwarf support */ - -/* debug information structure */ -struct debuginfo { - Dwarf *dbg; - Dwfl_Module *mod; - Dwfl *dwfl; - Dwarf_Addr bias; - const unsigned char *build_id; -}; - -/* This also tries to open distro debuginfo */ -struct debuginfo *debuginfo__new(const char *path); -void debuginfo__delete(struct debuginfo *dbg); +#include "debuginfo.h" /* Find probe_trace_events specified by perf_probe_event from debuginfo */ int debuginfo__find_trace_events(struct debuginfo *dbg, @@ -49,9 +35,6 @@ int debuginfo__find_trace_events(struct debuginfo *dbg, int debuginfo__find_probe_point(struct debuginfo *dbg, u64 addr, struct perf_probe_point *ppt); -int debuginfo__get_text_offset(struct debuginfo *dbg, Dwarf_Addr *offs, - bool adjust_offset); - /* Find a line range */ int debuginfo__find_line_range(struct debuginfo *dbg, struct line_range *lr); From patchwork Thu Oct 12 03:50:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418212 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0B3F4EBD; Thu, 12 Oct 2023 03:51:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="LYW41OiI" Received: from mail-pl1-x633.google.com (mail-pl1-x633.google.com [IPv6:2607:f8b0:4864:20::633]) by lindbergh.monkeyblade.net (Postfix) with 
ESMTPS id 71E20A9; Wed, 11 Oct 2023 20:51:20 -0700 (PDT) Received: by mail-pl1-x633.google.com with SMTP id d9443c01a7336-1c8a1541232so4988625ad.0; Wed, 11 Oct 2023 20:51:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082680; x=1697687480; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=y4sMEvu1pH+3RJOeI6urGWSTCcNMJ6GLwYLy9hdY0TY=; b=LYW41OiIeoV2GIz0psGTEaV/x1YQo3THtFP5mHXT6T8PlVmVvUhicWxOaJfcbUp3Dm x9spLCmwo731OAKUH0zPXIVTVcIHCxqJwlk1DUPzD20AFPK0lxupmiF0J9vCvxdxAwOi fCVqehsNJxCZFm+EHgg/QZq+G8mcjTDAvNMNlBUFO1yEXjiepVr3dketwaGQFHG6RawI ZD3m3ZbVV2JiQzZG0cGB/r9Z6XzyeAVZIKMAblszHKj044x8dQkrd7PgXYEUwr3QpqaC WjoA7XH6gPSSGhSGt/T47G/0XZbeqaEwCqNzGzr/ieXjDT80ENmu0M4L0GOFMyqQXA90 drVw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082680; x=1697687480; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=y4sMEvu1pH+3RJOeI6urGWSTCcNMJ6GLwYLy9hdY0TY=; b=V8w938/6rAp/MlEeiN6sgEs0DINJtW93HQcigUPCktsi9/hab6YGK/yteQp5kPvjrE wdjdidEWhl6JFerFpv/1WKX/yCzOO8ufbvrzQr0go4TW+n58+kjG8OXcSkdQCpX5bmDs hNsSAjaCla4QB5q45nfTfShk48EDzNM1ggQX6RNM/fy4roYAmuMZO2m+gdFT95cMMdNB xHpO92N4xn95WmvLWMUbZpwKFnGorIrzGEH7kaVUIiNJTlCGmwYIJC5jyGAB63iepZAj ZuVlWPYT6d6+fS1gtGqNbMhktjOy85et8nFhwesvZh6OzDEW4m0MhunA/j4A/3MHbztu /q0g== X-Gm-Message-State: AOJu0Yxm55eYD6DD4+zlRHD85bbZEwF/s0rii5ZvHZXLgwP//Bz14HOX cbmlhLcQvgQRL6d81aIbqUO419ZApTI= X-Google-Smtp-Source: AGHT+IG2ziFdYY4l2DxOHPyhPyJxsj/pXWB6ID7xRNKie6lBmqraGNqSl49oXXFcwJ9t92eDe3aIRg== X-Received: by 2002:a17:903:228f:b0:1c7:1fbc:b9e7 with SMTP id b15-20020a170903228f00b001c71fbcb9e7mr27349370plh.43.1697082679910; Wed, 11 Oct 2023 20:51:19 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net 
([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.51.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:51:19 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 04/48] perf dwarf-aux: Fix die_get_typename() for void * Date: Wed, 11 Oct 2023 20:50:27 -0700 Message-ID: <20231012035111.676789-5-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net The die_get_typename() is to return a C-like type name from a DWARF debug entry and it follows the data type if the target entry is a pointer type. But I found void pointers don't have the type attribute to follow and then the function returns an error for that case. This results in a broken type string for void pointer types. For example, the following type entries are pointer types. 
<1><48c>: Abbrev Number: 4 (DW_TAG_pointer_type) <48d> DW_AT_byte_size : 8 <48d> DW_AT_type : <0x481> <1><491>: Abbrev Number: 211 (DW_TAG_pointer_type) <493> DW_AT_byte_size : 8 <1><494>: Abbrev Number: 4 (DW_TAG_pointer_type) <495> DW_AT_byte_size : 8 <495> DW_AT_type : <0x49e> The first one at offset 48c and the third one at offset 494 have type information. Then they are pointer types for the referenced types. But the second one at offset 491 doesn't have the type attribute. Cc: Masami Hiramatsu Signed-off-by: Namhyung Kim Acked-by: Masami Hiramatsu (Google) --- tools/perf/util/dwarf-aux.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c index 2941d88f2199..4849c3bbfd95 100644 --- a/tools/perf/util/dwarf-aux.c +++ b/tools/perf/util/dwarf-aux.c @@ -1090,7 +1090,14 @@ int die_get_typename(Dwarf_Die *vr_die, struct strbuf *buf) return strbuf_addf(buf, "%s%s", tmp, name ?: ""); } ret = die_get_typename(&type, buf); - return ret ? 
ret : strbuf_addstr(buf, tmp); + if (ret < 0) { + /* void pointer has no type attribute */ + if (tag == DW_TAG_pointer_type && ret == -ENOENT) + return strbuf_addf(buf, "void*"); + + return ret; + } + return strbuf_addstr(buf, tmp); } /** From patchwork Thu Oct 12 03:50:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418213 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F0CC2EC2; Thu, 12 Oct 2023 03:51:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="hJlN7VF5" Received: from mail-pl1-x62d.google.com (mail-pl1-x62d.google.com [IPv6:2607:f8b0:4864:20::62d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C0D57B6; Wed, 11 Oct 2023 20:51:21 -0700 (PDT) Received: by mail-pl1-x62d.google.com with SMTP id d9443c01a7336-1bdf4752c3cso4011205ad.2; Wed, 11 Oct 2023 20:51:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082681; x=1697687481; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=cPOXjwmN3ZMMh06n/p0NGkvvVnZrqkqOb37OFAMvoL8=; b=hJlN7VF5lkojISEEEE6JFzje5Q47d3eIQggM9OIUaVWObmzB4vkNaqkL4VgJj742wO S27+8AvdrQYQnRDjhhbaYM8TdXCKq4zBBXtnWRbTnO47uj6FkoaW6F92XoxoEUa6w/Qp HybNbypvFinrzXbqcOgBnLXY90Vm73C4GuLAadcXYLjCcMoPzf75ttS5vkvybjDFzUK5 lkCJhxCK5f5bIHjjppHjWcM+Tapv6ULUOA6N0jAaxdjYUjonUzcSnLK7G4gMVXeasq+/ yzRW+oiUcua5J2JcQAlL5/pMF2fF5QS0crlb1gQxVJIKSh3UHWU9GtqhWD9A8qqA1j5T Otqg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082681; x=1697687481; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=cPOXjwmN3ZMMh06n/p0NGkvvVnZrqkqOb37OFAMvoL8=; b=Od9VgrU0q22XG0PFokq41TZbbhl+ahO0m/j1UkbqtQ7ZExuO0s7uc86cMjgHwgKWM/ rGuNrXFOJ2anQuzDyS/yd1InOr9mR0O3pcGuIItkW1YXTrmyyhO/7kScq7fjeJs8Qlo+ vRPQRkAq43begNmXr9UhR8Ztyzu+/r6aZvcLRhBH9sZf4YhFIGYNP+loXu6kcQP1x72I 6IGMnMnVoshbGniZnsCz1rcBT2xvUnW0myZzh01yaJprPqhlP2YrViN3kgc5vdobP9kL jmh5+3F3hznke58EopsYXT5eeK/S9YQrUNTp5TBLvAqpMUVClqERxCoHFhEftC46s5u2 n70g== X-Gm-Message-State: AOJu0Yzo6OHIZpySHRuSi2lG3F8rBoRgT9jsXktMmsaUcLACYzi7VCHT TpKg7nb273stfkfLXf7dNws= X-Google-Smtp-Source: AGHT+IE4CgDyywX74RJCXXMuhTOVqCARwR6R15YcE3PND7DYtjgY7/8K/ip2EFBFfMLWNx0PDchnUQ== X-Received: by 2002:a17:902:dacd:b0:1c0:c174:3695 with SMTP id q13-20020a170902dacd00b001c0c1743695mr23283409plx.13.1697082681229; Wed, 11 Oct 2023 20:51:21 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.51.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:51:20 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 05/48] perf dwarf-aux: Move #ifdef code to the header file Date: Wed, 11 Oct 2023 20:50:28 -0700 Message-ID: <20231012035111.676789-6-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: 
MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net It's a usual convention that the conditional code is handled in a header file. As I'm planning to add some more of them, let's move the current code to the header first. Cc: Masami Hiramatsu Signed-off-by: Namhyung Kim Acked-by: Masami Hiramatsu (Google) --- tools/perf/util/dwarf-aux.c | 7 ------- tools/perf/util/dwarf-aux.h | 19 +++++++++++++++++-- 2 files changed, 17 insertions(+), 9 deletions(-) diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c index 4849c3bbfd95..adef2635587d 100644 --- a/tools/perf/util/dwarf-aux.c +++ b/tools/perf/util/dwarf-aux.c @@ -1245,13 +1245,6 @@ int die_get_var_range(Dwarf_Die *sp_die, Dwarf_Die *vr_die, struct strbuf *buf) out: return ret; } -#else -int die_get_var_range(Dwarf_Die *sp_die __maybe_unused, - Dwarf_Die *vr_die __maybe_unused, - struct strbuf *buf __maybe_unused) -{ - return -ENOTSUP; -} #endif /* diff --git a/tools/perf/util/dwarf-aux.h b/tools/perf/util/dwarf-aux.h index 7ec8bc1083bb..4f5d0211ee4f 100644 --- a/tools/perf/util/dwarf-aux.h +++ b/tools/perf/util/dwarf-aux.h @@ -121,7 +121,6 @@ int die_get_typename(Dwarf_Die *vr_die, struct strbuf *buf); /* Get the name and type of given variable DIE, stored as "type\tname" */ int die_get_varname(Dwarf_Die *vr_die, struct strbuf *buf); -int die_get_var_range(Dwarf_Die *sp_die, Dwarf_Die *vr_die, struct strbuf *buf); /* Check if target program is compiled with optimization */ bool die_is_optimized_target(Dwarf_Die *cu_die); @@ -130,4 +129,20 @@ bool die_is_optimized_target(Dwarf_Die *cu_die); void die_skip_prologue(Dwarf_Die *sp_die, Dwarf_Die *cu_die, Dwarf_Addr *entrypc); -#endif +#ifdef 
HAVE_DWARF_GETLOCATIONS_SUPPORT + +/* Get byte offset range of given variable DIE */ +int die_get_var_range(Dwarf_Die *sp_die, Dwarf_Die *vr_die, struct strbuf *buf); + +#else /* HAVE_DWARF_GETLOCATIONS_SUPPORT */ + +static inline int die_get_var_range(Dwarf_Die *sp_die __maybe_unused, + Dwarf_Die *vr_die __maybe_unused, + struct strbuf *buf __maybe_unused) +{ + return -ENOTSUP; +} + +#endif /* HAVE_DWARF_GETLOCATIONS_SUPPORT */ + +#endif /* _DWARF_AUX_H */ From patchwork Thu Oct 12 03:50:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418214 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DCCA5ECF; Thu, 12 Oct 2023 03:51:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="ITRuwuGQ" Received: from mail-pl1-x633.google.com (mail-pl1-x633.google.com [IPv6:2607:f8b0:4864:20::633]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1569EC6; Wed, 11 Oct 2023 20:51:23 -0700 (PDT) Received: by mail-pl1-x633.google.com with SMTP id d9443c01a7336-1c871a095ceso4549055ad.2; Wed, 11 Oct 2023 20:51:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082682; x=1697687482; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=oDCRLN//GDrGJp7khI5N6nKLZqiZJEAgcyWXlKtsgWI=; b=ITRuwuGQsdVJsSJtABcm3Elp0k/+SOu+TjSj5C20R+F1Y1EikCShpvWvMtYIIEiDQS EaJ3djTVa9vO8UFQKZVP+M4SOSTjiJdNgN2XpUmfQ+l17jpKaws297C+8j4uCkvSqFC9 BMGFYHoRt09gyftMoELYY45ZGAkM/8kI0P7NuYprsgrr5BE8JAWKBmGvWQriDbXb6AjX 
Sif4ZIlcIVYBEvx+1GBRk6DxUQUP5E0T19Ki9ToSX7qm5stLrK4HICA2gblmW8yk23He 2Jxnmt8XDXxuFCNsvQMZHzucHo9bUCj3IC06iWgMQHX6BnO4tMvKhXUbB5THEsm/Qi4M fe9A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082682; x=1697687482; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=oDCRLN//GDrGJp7khI5N6nKLZqiZJEAgcyWXlKtsgWI=; b=CU3SXSjivXCt/rKBVsZslvhlpGq9h3lb6m1TuRnMHsfTmOQ8FQQo6Bdt0PMaQKReHP sbt32EBnkMb1B2mBZAPD1vzDVl3zbGgE1T/vPXl40H8im8CuLVsphw3R4b8MwFPLgy2f Osgi+zrlWXRpcMGEMgfL85LM1+8opG9g9yEluM5lBKqchC+kxSCifo2KRDG63kgrdUHb +NT+4gYBsG18V7KgcBvlAPlQTlJXnc6mLDbvsjVEwz/jmKGTPZZNMgYbhQRU4wRB6U0g bH4yEqNXyB/Y6A9+5y9/o40Q9KszEhG0kQy9gFfB+YxgMXTJoVj9+qkbZnkkqdjSzOj7 TmkA== X-Gm-Message-State: AOJu0YzDFVClPfRWpT8LJ1a8sgN7X9BOTqPZISzIeLJdHbVDRL37YN/1 7s4nkGX7wkEXKnwBWaTc5s8= X-Google-Smtp-Source: AGHT+IHdYZPZ7jxW9iCz5UaWHuPYiF5gqvs27HuXcrwTz+rGQzDjAb2lw+Tki39iGPe9p1g2++2Meg== X-Received: by 2002:a17:903:41c9:b0:1c5:f1fd:5da with SMTP id u9-20020a17090341c900b001c5f1fd05damr24713893ple.2.1697082682477; Wed, 11 Oct 2023 20:51:22 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.51.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:51:22 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 06/48] perf dwarf-aux: Add die_get_scopes() helper Date: Wed, 11 Oct 2023 20:50:29 -0700 Message-ID: <20231012035111.676789-7-namhyung@kernel.org> X-Mailer: git-send-email 
2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net The die_get_scopes() would return the number of enclosing DIEs for the given address and it fills an array of DIEs like dwarf_getscopes(). But it doesn't follow the abstract origin of inlined functions as we want information of the concrete instance. This is needed to check the location of parameters and local variables properly. Users can check the origin separately if needed. Cc: Masami Hiramatsu Signed-off-by: Namhyung Kim Acked-by: Masami Hiramatsu (Google) --- tools/perf/util/dwarf-aux.c | 53 +++++++++++++++++++++++++++++++++++++ tools/perf/util/dwarf-aux.h | 3 +++ 2 files changed, 56 insertions(+) diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c index adef2635587d..10aa32334d6f 100644 --- a/tools/perf/util/dwarf-aux.c +++ b/tools/perf/util/dwarf-aux.c @@ -1425,3 +1425,56 @@ void die_skip_prologue(Dwarf_Die *sp_die, Dwarf_Die *cu_die, *entrypc = postprologue_addr; } + +/* Internal parameters for __die_find_scope_cb() */ +struct find_scope_data { + /* Target instruction address */ + Dwarf_Addr pc; + /* Number of scopes found [output] */ + int nr; + /* Array of scopes found, 0 for the outermost one. 
[output] */ + Dwarf_Die *scopes; +}; + +static int __die_find_scope_cb(Dwarf_Die *die_mem, void *arg) +{ + struct find_scope_data *data = arg; + + if (dwarf_haspc(die_mem, data->pc)) { + Dwarf_Die *tmp; + + tmp = realloc(data->scopes, (data->nr + 1) * sizeof(*tmp)); + if (tmp == NULL) + return DIE_FIND_CB_END; + + memcpy(tmp + data->nr, die_mem, sizeof(*die_mem)); + data->scopes = tmp; + data->nr++; + return DIE_FIND_CB_CHILD; + } + return DIE_FIND_CB_SIBLING; +} + +/** + * die_get_scopes - Return a list of scopes including the address + * @cu_die: a compile unit DIE + * @pc: the address to find + * @scopes: the array of DIEs for scopes (result) + * + * This function does the same as the dwarf_getscopes() but doesn't follow + * the origins of inlined functions. It returns the number of scopes saved + * in the @scopes argument. The outer scope will be saved first (index 0) and + * the last one is the innermost scope at the @pc. + */ +int die_get_scopes(Dwarf_Die *cu_die, Dwarf_Addr pc, Dwarf_Die **scopes) +{ + struct find_scope_data data = { + .pc = pc, + }; + Dwarf_Die die_mem; + + die_find_child(cu_die, __die_find_scope_cb, &data, &die_mem); + + *scopes = data.scopes; + return data.nr; +} diff --git a/tools/perf/util/dwarf-aux.h b/tools/perf/util/dwarf-aux.h index 4f5d0211ee4f..f9d765f80fb0 100644 --- a/tools/perf/util/dwarf-aux.h +++ b/tools/perf/util/dwarf-aux.h @@ -129,6 +129,9 @@ bool die_is_optimized_target(Dwarf_Die *cu_die); void die_skip_prologue(Dwarf_Die *sp_die, Dwarf_Die *cu_die, Dwarf_Addr *entrypc); +/* Get the list of including scopes */ +int die_get_scopes(Dwarf_Die *cu_die, Dwarf_Addr pc, Dwarf_Die **scopes); + #ifdef HAVE_DWARF_GETLOCATIONS_SUPPORT /* Get byte offset range of given variable DIE */ From patchwork Thu Oct 12 03:50:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418215 Received: from lindbergh.monkeyblade.net 
(lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1FF30EA3; Thu, 12 Oct 2023 03:51:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="bWvj/1R3" Received: from mail-pl1-x632.google.com (mail-pl1-x632.google.com [IPv6:2607:f8b0:4864:20::632]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5D35BCC; Wed, 11 Oct 2023 20:51:24 -0700 (PDT) Received: by mail-pl1-x632.google.com with SMTP id d9443c01a7336-1c60f1a2652so4037085ad.0; Wed, 11 Oct 2023 20:51:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082684; x=1697687484; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=iJAPgps5vTes2Y2qx7RQDKoRh92FtNMtdWH/CZXcwx0=; b=bWvj/1R3SWPRqd2i3KlTv8jDqSzj9TljduYQGyEpLscC8Lk2nwEuyTPxggWCk2IhmW rqeCeztkeqAfvkNV9jI2TZM35d7OKUJEyYxcMG8Cwc6WZC3o3vxapoI6lFaU6RoZyQXj YxIMR33ul3z4xRa1PjEUhXnWDCDP6Q5XLFjlRwzeKszIDyPhdOodZdDQEMJVzXcjcTlt CXwZBHkkzX0dmEu56tm6QVXZ9LRJHXw0GJK2fGg2qmBS3+fMoYdOVqIdQFT5rlFXlvKD vjKfAgO9QbGJ1r5WRSkWHCu/J3BatqTVMgZ0AYxhQ31xr0+UaMyfXqzd/OusbwX8XuM9 2K9Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082684; x=1697687484; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=iJAPgps5vTes2Y2qx7RQDKoRh92FtNMtdWH/CZXcwx0=; b=wcvcXDZ1WDEd9T6PE2EaI1CQjpNcTT11emSn4yu486T37YfIkoOigNPKQiR+4nla4D dBCMbcwhJOnmzjRoHIBLu+ZrmRNDAohwXLErRA4rSXMBsRN1DvB7ksgMiD9NGdivCdzQ 65Oz3RdeJiI+V8o2m6S3VQTARP1J5uSh1uMf2C8VRtg5uS7lq4iuyHKv2WRNN+i6NA4/ 
Nk7l4xD7KBU7bKVRB8xy1HGOlcr+WFoNXjoUFMLwVEGI0CzFEm/K76vPHUOauFQIIvJD 6EpCMEXRlvGPS4hho8XM4XXYX5PrBvB6sIfaZu7puJnHQqXiUmNAlCjXVmAZs7mHkJ5O mmuA== X-Gm-Message-State: AOJu0YwuaDuXvqh/qENHpDeiaiXhMG0xX5e7kEvIie8QouS2+5cvKaN0 fl9EmK2Fp2SDy35WfQlRnb0= X-Google-Smtp-Source: AGHT+IHA2NlxKsGKl2muwRYeArO/59rj3wAkwqkfeISU9JvT2XBOHWaXkyl4VroGUHqIQaihm5Lruw== X-Received: by 2002:a17:903:2449:b0:1b6:a37a:65b7 with SMTP id l9-20020a170903244900b001b6a37a65b7mr31662151pls.23.1697082683762; Wed, 11 Oct 2023 20:51:23 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.51.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:51:23 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 07/48] perf dwarf-aux: Add die_find_variable_by_reg() helper Date: Wed, 11 Oct 2023 20:50:30 -0700 Message-ID: <20231012035111.676789-8-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net The die_find_variable_by_reg() will search for a variable or a 
parameter sub-DIE in the given scope DIE where the location matches the given register. In the simplest and most common case, a memory access uses a base register and an offset to the field, so the register would hold a pointer in a variable or function parameter. Then we can find one if it has a location expression at the (instruction) address. So this function only handles such a simple case for now. In this case, the expression would have a DW_OP_regN operation where N < 32. If the register index (N) is greater than or equal to 32, a DW_OP_regx operation whose operand holds the value of N is used instead. Expressions with more than one operation are rejected. Cc: Masami Hiramatsu Signed-off-by: Namhyung Kim Acked-by: Masami Hiramatsu (Google) --- tools/perf/util/dwarf-aux.c | 67 +++++++++++++++++++++++++++++++++++++ tools/perf/util/dwarf-aux.h | 12 +++++++ 2 files changed, 79 insertions(+) diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c index 10aa32334d6f..652e6e7368a2 100644 --- a/tools/perf/util/dwarf-aux.c +++ b/tools/perf/util/dwarf-aux.c @@ -1245,6 +1245,73 @@ int die_get_var_range(Dwarf_Die *sp_die, Dwarf_Die *vr_die, struct strbuf *buf) out: return ret; } + +/* Internal parameters for __die_find_var_reg_cb() */ +struct find_var_data { + /* Target instruction address */ + Dwarf_Addr pc; + /* Target register */ + unsigned reg; +}; + +/* Max number of registers DW_OP_regN supports */ +#define DWARF_OP_DIRECT_REGS 32 + +/* Only checks direct child DIEs in the given scope. 
*/ +static int __die_find_var_reg_cb(Dwarf_Die *die_mem, void *arg) +{ + struct find_var_data *data = arg; + int tag = dwarf_tag(die_mem); + ptrdiff_t off = 0; + Dwarf_Attribute attr; + Dwarf_Addr base, start, end; + Dwarf_Op *ops; + size_t nops; + + if (tag != DW_TAG_variable && tag != DW_TAG_formal_parameter) + return DIE_FIND_CB_SIBLING; + + if (dwarf_attr(die_mem, DW_AT_location, &attr) == NULL) + return DIE_FIND_CB_SIBLING; + + while ((off = dwarf_getlocations(&attr, off, &base, &start, &end, &ops, &nops)) > 0) { + /* Assuming the location list is sorted by address */ + if (end < data->pc) + continue; + if (start > data->pc) + break; + + /* Only match with a simple case */ + if (data->reg < DWARF_OP_DIRECT_REGS) { + if (ops->atom == (DW_OP_reg0 + data->reg) && nops == 1) + return DIE_FIND_CB_END; + } else { + if (ops->atom == DW_OP_regx && ops->number == data->reg && + nops == 1) + return DIE_FIND_CB_END; + } + } + return DIE_FIND_CB_SIBLING; +} + +/** + * die_find_variable_by_reg - Find a variable saved in a register + * @sc_die: a scope DIE + * @pc: the program address to find + * @reg: the register number to find + * @die_mem: a buffer to save the resulting DIE + * + * Find the variable DIE accessed by the given register. 
+ */ +Dwarf_Die *die_find_variable_by_reg(Dwarf_Die *sc_die, Dwarf_Addr pc, int reg, + Dwarf_Die *die_mem) +{ + struct find_var_data data = { + .pc = pc, + .reg = reg, + }; + return die_find_child(sc_die, __die_find_var_reg_cb, &data, die_mem); +} #endif /* diff --git a/tools/perf/util/dwarf-aux.h b/tools/perf/util/dwarf-aux.h index f9d765f80fb0..b6f430730bd1 100644 --- a/tools/perf/util/dwarf-aux.h +++ b/tools/perf/util/dwarf-aux.h @@ -137,6 +137,10 @@ int die_get_scopes(Dwarf_Die *cu_die, Dwarf_Addr pc, Dwarf_Die **scopes); /* Get byte offset range of given variable DIE */ int die_get_var_range(Dwarf_Die *sp_die, Dwarf_Die *vr_die, struct strbuf *buf); +/* Find a variable saved in the 'reg' at given address */ +Dwarf_Die *die_find_variable_by_reg(Dwarf_Die *sc_die, Dwarf_Addr pc, int reg, + Dwarf_Die *die_mem); + #else /* HAVE_DWARF_GETLOCATIONS_SUPPORT */ static inline int die_get_var_range(Dwarf_Die *sp_die __maybe_unused, @@ -146,6 +150,14 @@ static inline int die_get_var_range(Dwarf_Die *sp_die __maybe_unused, return -ENOTSUP; } +static inline Dwarf_Die *die_find_variable_by_reg(Dwarf_Die *sc_die __maybe_unused, + Dwarf_Addr pc __maybe_unused, + int reg __maybe_unused, + Dwarf_Die *die_mem __maybe_unused) +{ + return NULL; +} + #endif /* HAVE_DWARF_GETLOCATIONS_SUPPORT */ #endif /* _DWARF_AUX_H */ From patchwork Thu Oct 12 03:50:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418216 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DA430812; Thu, 12 Oct 2023 03:51:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="EaYaIaia" Received: from mail-pg1-x52f.google.com 
(mail-pg1-x52f.google.com [IPv6:2607:f8b0:4864:20::52f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D372BD8; Wed, 11 Oct 2023 20:51:25 -0700 (PDT) Received: by mail-pg1-x52f.google.com with SMTP id 41be03b00d2f7-5859a7d6556so417237a12.0; Wed, 11 Oct 2023 20:51:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082685; x=1697687485; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=aT8z3RMvM5F26FD3Kbo/1bwOpEfsadnhZEpHV6IP5pg=; b=EaYaIaiay2u/pclrS1KXWN1dujxHAjVbZRaapLT2tzZsFgsiR4WuS2nnseUMcmY53f XEsCK7hDtsGNQNfZL9NSTpA0Jl8GP0jqQSa5AIRaOhztOmhQt16dlOitvwaZ6HYbiVQ0 Q4belOW36ISdcgFImYgrwysaNImBRpfiLVC+IKlGG3u41CUtKhWDTa9k3jvy1BsaOvvn DzQJbk+Xq/LLwrPWTh9D+au8fHMrfdpC/aMN6ujYfA6EKaZN/OQ5lKC9xHtwASY30U6y sdQPlaPlu7wzShGLd3HQRd9ZuGW3dXgF6KzKtWF2lZOVOcvHRr7EkbZDAgH2n9lhjZdr 8I7Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082685; x=1697687485; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=aT8z3RMvM5F26FD3Kbo/1bwOpEfsadnhZEpHV6IP5pg=; b=mRTwPZ0Gy9gzWkSlNSSs3xy/uqvfX/EYjc37R2I2WENSfjk+9ghBO90X5iXxcxd0DU aBLLjSFADxpcyI9iUK6DSStbA3yOtVIXo60Ki6whLNJqYhaYdO2Tg2/9DpGEGy5jzFCT wzQO+Q9Q3+ngQP/S6Poyi0V+GpL9Hb/O+C0M58y+ctHFaETo9foo7QHHaVV8/OIrKLtZ RmOFqRbEjt+MXHRVTQduLJuntcuXLTGATzE1CgIJ9Xks0l+lk8t6M8Olwl5d+ZXtHFPn 2isIJ3P9DPDtDK+5vs9jmtmkY134ZcW7Boag8QEqgbPvOlO9/Nkw5Wjl33dke3IpJP4o jlZw== X-Gm-Message-State: AOJu0YxSyFe55VqcZgQvcdjOMGcyERCh6S50gmRPB2i3np63l5wyBIIt a7a0H+anJBhw6RTDIFcjKTs6iEx7Rs8= X-Google-Smtp-Source: AGHT+IG134Ei6UM+lAX7WchV1fDDjuJo11PdahSufKGcv4xgIfpiBosiowsTg7DvvCUjXRSTsk1WWw== X-Received: by 2002:a05:6a20:72a2:b0:133:bbe0:312f with SMTP id 
o34-20020a056a2072a200b00133bbe0312fmr28190854pzk.50.1697082685172; Wed, 11 Oct 2023 20:51:25 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.51.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:51:24 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 08/48] perf dwarf-aux: Factor out __die_get_typename() Date: Wed, 11 Oct 2023 20:50:31 -0700 Message-ID: <20231012035111.676789-9-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net The __die_get_typename() is to get the C-style type name of the given DIE. The difference from die_get_typename() is that it does not retrieve the DW_AT_type and uses the given DIE directly. This will be used when users already know the type DIE. 
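As a rough illustration of the split described above (not the perf code itself), the sketch below models a DIE as a small struct and shows the two entry points: one that names a type entry directly, and one that first follows the entry's type reference. Everything here is hypothetical: struct mock_die, enum mock_tag, mock_type_typename() and mock_var_typename() are made-up stand-ins, and a fixed char buffer replaces struct strbuf. It assumes, as in patch 04, that a pointer entry without a type reference should be printed as "void*".

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for DWARF tags; these are not the libdw constants. */
enum mock_tag { MOCK_BASE_TYPE, MOCK_STRUCT_TYPE, MOCK_POINTER_TYPE, MOCK_VARIABLE };

/* Hypothetical, simplified DIE: a tag, an optional name, an optional type reference. */
struct mock_die {
	enum mock_tag tag;
	const char *name;            /* NULL when the entry has no name */
	const struct mock_die *type; /* NULL when there is no type attribute */
};

static int mock_var_typename(const struct mock_die *die, char *buf, size_t len);

/* Rough analogue of __die_get_typename(): name the given type entry directly. */
static int mock_type_typename(const struct mock_die *type, char *buf, size_t len)
{
	size_t used;
	int ret;

	if (type->tag != MOCK_POINTER_TYPE) {
		const char *prefix = type->tag == MOCK_STRUCT_TYPE ? "struct " : "";

		ret = snprintf(buf, len, "%s%s", prefix, type->name ? type->name : "");
		return (ret < 0 || (size_t)ret >= len) ? -ENOMEM : 0;
	}

	/* Pointer: name the pointee first, then append "*". */
	ret = mock_var_typename(type, buf, len);
	if (ret == -ENOENT) {
		/* A pointer entry without a type reference is printed as void*. */
		ret = snprintf(buf, len, "void*");
		return (ret < 0 || (size_t)ret >= len) ? -ENOMEM : 0;
	}
	if (ret < 0)
		return ret;

	used = strlen(buf);
	if (used + 2 > len)
		return -ENOMEM;
	buf[used] = '*';
	buf[used + 1] = '\0';
	return 0;
}

/* Rough analogue of die_get_typename(): follow the type reference, then name it. */
static int mock_var_typename(const struct mock_die *die, char *buf, size_t len)
{
	if (die->type == NULL)
		return -ENOENT;
	return mock_type_typename(die->type, buf, len);
}
```

A caller that already holds a type entry would call mock_type_typename() on it, while a caller holding a variable entry would go through mock_var_typename(). In this model a variable whose type is a pointer with no type reference comes out as "void*", and a pointer to int comes out as "int*".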
Cc: Masami Hiramatsu Signed-off-by: Namhyung Kim --- tools/perf/util/dwarf-aux.c | 38 ++++++++++++++++++++++++++----------- tools/perf/util/dwarf-aux.h | 3 +++ 2 files changed, 30 insertions(+), 11 deletions(-) diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c index 652e6e7368a2..5bb05c84d249 100644 --- a/tools/perf/util/dwarf-aux.c +++ b/tools/perf/util/dwarf-aux.c @@ -1051,32 +1051,28 @@ Dwarf_Die *die_find_member(Dwarf_Die *st_die, const char *name, } /** - * die_get_typename - Get the name of given variable DIE - * @vr_die: a variable DIE + * __die_get_typename - Get the name of given type DIE + * @type: a type DIE * @buf: a strbuf for result type name * - * Get the name of @vr_die and stores it to @buf. Return 0 if succeeded. + * Get the name of @type and stores it to @buf. Return 0 if succeeded. * and Return -ENOENT if failed to find type name. * Note that the result will stores typedef name if possible, and stores * "*(function_type)" if the type is a function pointer. 
*/ -int die_get_typename(Dwarf_Die *vr_die, struct strbuf *buf) +int __die_get_typename(Dwarf_Die *type, struct strbuf *buf) { - Dwarf_Die type; int tag, ret; const char *tmp = ""; - if (__die_get_real_type(vr_die, &type) == NULL) - return -ENOENT; - - tag = dwarf_tag(&type); + tag = dwarf_tag(type); if (tag == DW_TAG_array_type || tag == DW_TAG_pointer_type) tmp = "*"; else if (tag == DW_TAG_subroutine_type) { /* Function pointer */ return strbuf_add(buf, "(function_type)", 15); } else { - const char *name = dwarf_diename(&type); + const char *name = dwarf_diename(type); if (tag == DW_TAG_union_type) tmp = "union "; @@ -1089,7 +1085,7 @@ int die_get_typename(Dwarf_Die *vr_die, struct strbuf *buf) /* Write a base name */ return strbuf_addf(buf, "%s%s", tmp, name ?: ""); } - ret = die_get_typename(&type, buf); + ret = die_get_typename(type, buf); if (ret < 0) { /* void pointer has no type attribute */ if (tag == DW_TAG_pointer_type && ret == -ENOENT) @@ -1100,6 +1096,26 @@ int die_get_typename(Dwarf_Die *vr_die, struct strbuf *buf) return strbuf_addstr(buf, tmp); } +/** + * die_get_typename - Get the name of given variable DIE + * @vr_die: a variable DIE + * @buf: a strbuf for result type name + * + * Get the name of @vr_die and stores it to @buf. Return 0 if succeeded. + * and Return -ENOENT if failed to find type name. + * Note that the result will stores typedef name if possible, and stores + * "*(function_type)" if the type is a function pointer. 
+ */
+int die_get_typename(Dwarf_Die *vr_die, struct strbuf *buf)
+{
+	Dwarf_Die type;
+
+	if (__die_get_real_type(vr_die, &type) == NULL)
+		return -ENOENT;
+
+	return __die_get_typename(&type, buf);
+}
+
 /**
  * die_get_varname - Get the name and type of given variable DIE
  * @vr_die: a variable DIE
diff --git a/tools/perf/util/dwarf-aux.h b/tools/perf/util/dwarf-aux.h
index b6f430730bd1..574405c57d3b 100644
--- a/tools/perf/util/dwarf-aux.h
+++ b/tools/perf/util/dwarf-aux.h
@@ -116,6 +116,9 @@ Dwarf_Die *die_find_variable_at(Dwarf_Die *sp_die, const char *name,
 Dwarf_Die *die_find_member(Dwarf_Die *st_die, const char *name,
 			   Dwarf_Die *die_mem);
 
+/* Get the name of given type DIE */
+int __die_get_typename(Dwarf_Die *type, struct strbuf *buf);
+
 /* Get the name of given variable DIE */
 int die_get_typename(Dwarf_Die *vr_die, struct strbuf *buf);
 

From patchwork Thu Oct 12 03:50:32 2023
X-Patchwork-Submitter: Namhyung Kim
X-Patchwork-Id: 13418217
From: Namhyung Kim
To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML,
 linux-perf-users@vger.kernel.org, Linus Torvalds, Stephane Eranian,
 Masami Hiramatsu, linux-toolchains@vger.kernel.org,
 linux-trace-devel@vger.kernel.org
Subject: [PATCH 09/48] perf dwarf-regs: Add get_dwarf_regnum()
Date: Wed, 11 Oct 2023 20:50:32 -0700
Message-ID: <20231012035111.676789-10-namhyung@kernel.org>
In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org>
References: <20231012035111.676789-1-namhyung@kernel.org>
X-Mailing-List: linux-trace-devel@vger.kernel.org

get_dwarf_regnum() returns the DWARF register number for a register
name string according to the psABI.  Also add two pseudo encodings:
DWARF_REG_PC, for the register used by PC-relative addressing, and
DWARF_REG_FB, for the frame base register.  Both need to be handled in
a special way.
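
As a rough illustration of the lookup described above (a sketch, not the
code in this patch), the following maps an AT&T-style operand such as
"%rbx)" to a DWARF register number.  The names sketch_regnum(),
sketch_table and SKETCH_REG_PC are hypothetical; the table is only a
small subset of the x86-64 psABI numbering, and the pseudo PC value is
copied from DWARF_REG_PC in the patch.  Unlike the patch, it measures
the register name with strcspn() instead of duplicating the string.

```c
#include <stddef.h>
#include <string.h>

/* Pseudo register number for PC-relative addressing (value taken from the patch) */
#define SKETCH_REG_PC 0xd3af9c

struct sketch_reg_idx {
	const char *name;
	int idx;
};

/* Hypothetical subset of the x86-64 psABI DWARF register numbering */
static const struct sketch_reg_idx sketch_table[] = {
	{ "rax", 0 }, { "eax", 0 },
	{ "rdx", 1 }, { "edx", 1 },
	{ "rcx", 2 }, { "ecx", 2 },
	{ "rbx", 3 }, { "ebx", 3 },
	{ "rbp", 6 }, { "rsp", 7 },
	{ "rip", SKETCH_REG_PC },
};

/* Return the DWARF register number for an operand like "%rax" or "%rbx)", or -1 */
static int sketch_regnum(const char *operand)
{
	size_t i, len;

	/* Register operands are expected to start with '%' */
	if (operand[0] != '%')
		return -1;
	operand++;

	/* Ignore everything from the first ' ', ',' or ')' onwards */
	len = strcspn(operand, " ,)");

	for (i = 0; i < sizeof(sketch_table) / sizeof(sketch_table[0]); i++) {
		const char *name = sketch_table[i].name;

		if (strlen(name) == len && !strncmp(name, operand, len))
			return sketch_table[i].idx;
	}
	return -1;
}
```

With this sketch, sketch_regnum("%eax, %ebx") looks only at "eax", and
a name without the leading '%' is rejected.  Sub-registers share the
number of their full register, which is why the table needs one entry
per spelling.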

Cc: Masami Hiramatsu
Signed-off-by: Namhyung Kim
---
 tools/perf/arch/x86/util/dwarf-regs.c | 38 +++++++++++++++++++++++++++
 tools/perf/util/dwarf-regs.c          | 33 +++++++++++++++++++++++
 tools/perf/util/include/dwarf-regs.h  | 11 ++++++++
 3 files changed, 82 insertions(+)

diff --git a/tools/perf/arch/x86/util/dwarf-regs.c b/tools/perf/arch/x86/util/dwarf-regs.c
index 530934805710..79835b897cae 100644
--- a/tools/perf/arch/x86/util/dwarf-regs.c
+++ b/tools/perf/arch/x86/util/dwarf-regs.c
@@ -113,3 +113,41 @@ int regs_query_register_offset(const char *name)
 			return roff->offset;
 	return -EINVAL;
 }
+
+struct dwarf_regs_idx {
+	const char *name;
+	int idx;
+};
+
+static const struct dwarf_regs_idx x86_regidx_table[] = {
+	{ "rax", 0 }, { "eax", 0 }, { "ax", 0 }, { "al", 0 },
+	{ "rdx", 1 }, { "edx", 1 }, { "dx", 1 }, { "dl", 1 },
+	{ "rcx", 2 }, { "ecx", 2 }, { "cx", 2 }, { "cl", 2 },
+	{ "rbx", 3 }, { "ebx", 3 }, { "bx", 3 }, { "bl", 3 },
+	{ "rsi", 4 }, { "esi", 4 }, { "si", 4 }, { "sil", 4 },
+	{ "rdi", 5 }, { "edi", 5 }, { "di", 5 }, { "dil", 5 },
+	{ "rbp", 6 }, { "ebp", 6 }, { "bp", 6 }, { "bpl", 6 },
+	{ "rsp", 7 }, { "esp", 7 }, { "sp", 7 }, { "spl", 7 },
+	{ "r8", 8 }, { "r8d", 8 }, { "r8w", 8 }, { "r8b", 8 },
+	{ "r9", 9 }, { "r9d", 9 }, { "r9w", 9 }, { "r9b", 9 },
+	{ "r10", 10 }, { "r10d", 10 }, { "r10w", 10 }, { "r10b", 10 },
+	{ "r11", 11 }, { "r11d", 11 }, { "r11w", 11 }, { "r11b", 11 },
+	{ "r12", 12 }, { "r12d", 12 }, { "r12w", 12 }, { "r12b", 12 },
+	{ "r13", 13 }, { "r13d", 13 }, { "r13w", 13 }, { "r13b", 13 },
+	{ "r14", 14 }, { "r14d", 14 }, { "r14w", 14 }, { "r14b", 14 },
+	{ "r15", 15 }, { "r15d", 15 }, { "r15w", 15 }, { "r15b", 15 },
+	{ "rip", DWARF_REG_PC },
+};
+
+int get_arch_regnum(const char *name)
+{
+	unsigned int i;
+
+	if (*name != '%')
+		return -1;
+
+	for (i = 0; i < ARRAY_SIZE(x86_regidx_table); i++)
+		if (!strcmp(x86_regidx_table[i].name, name + 1))
+			return x86_regidx_table[i].idx;
+	return -1;
+}
diff --git a/tools/perf/util/dwarf-regs.c b/tools/perf/util/dwarf-regs.c
index 69cfaa5953bf..28d786c7df55 100644
--- a/tools/perf/util/dwarf-regs.c
+++ b/tools/perf/util/dwarf-regs.c
@@ -5,6 +5,8 @@
  * Written by: Masami Hiramatsu
  */
 
+#include
+#include
 #include
 #include
 #include
@@ -68,3 +70,34 @@ const char *get_dwarf_regstr(unsigned int n, unsigned int machine)
 	}
 	return NULL;
 }
+
+__weak int get_arch_regnum(const char *name __maybe_unused)
+{
+	return -1;
+}
+
+/* Return DWARF register number from architecture register name */
+int get_dwarf_regnum(const char *name, unsigned int machine)
+{
+	char *regname = strdup(name);
+	int reg = -1;
+	char *p;
+
+	if (regname == NULL)
+		return -1;
+
+	/* For convenience, remove trailing characters */
+	p = strpbrk(regname, " ,)");
+	if (p)
+		*p = '\0';
+
+	switch (machine) {
+	case EM_NONE: /* Generic arch - use host arch */
+		reg = get_arch_regnum(regname);
+		break;
+	default:
+		pr_err("ELF MACHINE %x is not supported.\n", machine);
+	}
+	free(regname);
+	return reg;
+}
diff --git a/tools/perf/util/include/dwarf-regs.h b/tools/perf/util/include/dwarf-regs.h
index 7d99a084e82d..b515f694f55e 100644
--- a/tools/perf/util/include/dwarf-regs.h
+++ b/tools/perf/util/include/dwarf-regs.h
@@ -2,6 +2,9 @@
 #ifndef _PERF_DWARF_REGS_H_
 #define _PERF_DWARF_REGS_H_
 
+#define DWARF_REG_PC 0xd3af9c /* random number */
+#define DWARF_REG_FB 0xd3affb /* random number */
+
 #ifdef HAVE_DWARF_SUPPORT
 const char *get_arch_regstr(unsigned int n);
 /*
@@ -10,6 +13,14 @@ const char *get_arch_regstr(unsigned int n);
  * machine: ELF machine signature (EM_*)
  */
 const char *get_dwarf_regstr(unsigned int n, unsigned int machine);
+
+int get_arch_regnum(const char *name);
+/*
+ * get_dwarf_regnum - Returns DWARF regnum from register name
+ * name: architecture register name
+ * machine: ELF machine signature (EM_*)
+ */
+int get_dwarf_regnum(const char *name, unsigned int machine);
 #endif
 
 #ifdef HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET

From patchwork Thu Oct 12 03:50:33 2023
X-Patchwork-Submitter: Namhyung Kim
X-Patchwork-Id: 13418218
From: Namhyung Kim
To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML,
 linux-perf-users@vger.kernel.org, Linus Torvalds, Stephane Eranian,
 Masami Hiramatsu, linux-toolchains@vger.kernel.org,
 linux-trace-devel@vger.kernel.org
Subject: [PATCH 10/48] perf annotate-data: Add find_data_type()
Date: Wed, 11 Oct 2023 20:50:33 -0700
Message-ID: <20231012035111.676789-11-namhyung@kernel.org>
In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org>
References: <20231012035111.676789-1-namhyung@kernel.org>
X-Mailing-List: linux-trace-devel@vger.kernel.org

find_data_type() returns the data type for the memory access at the
given address (IP), using a register and an offset.  It requires DWARF
debug info in the DSO and searches the variables and function
parameters in scope.

In pseudo code, it basically does the following:

  find_data_type(dso, ip, reg, offset)
  {
      pc = map__rip_2objdump(ip);
      CU = dwarf_addrdie(dso->dwarf, pc);
      scopes = die_get_scopes(CU, pc);
      for_each_scope(S, scopes) {
          V = die_find_variable_by_reg(S, pc, reg);
          if (V && V.type == pointer_type) {
              T = die_get_real_type(V);
              if (offset < T.size)
                  return T;
          }
      }
      return NULL;
  }

Signed-off-by: Namhyung Kim
---
 tools/perf/util/Build           |   1 +
 tools/perf/util/annotate-data.c | 163 ++++++++++++++++++++++++++++++++
 tools/perf/util/annotate-data.h |  40 ++++++++
 3 files changed, 204 insertions(+)
 create mode 100644 tools/perf/util/annotate-data.c
 create mode 100644 tools/perf/util/annotate-data.h

diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index a82122516720..cdc8a850859c 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -195,6 +195,7 @@ perf-$(CONFIG_DWARF) += probe-finder.o
 perf-$(CONFIG_DWARF) += dwarf-aux.o
 perf-$(CONFIG_DWARF) += dwarf-regs.o
 perf-$(CONFIG_DWARF) += debuginfo.o
+perf-$(CONFIG_DWARF) += annotate-data.o
 
 perf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o
 perf-$(CONFIG_LOCAL_LIBUNWIND) += unwind-libunwind-local.o
diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
new file mode 100644
index 000000000000..b3d519b7514b
--- /dev/null
+++ b/tools/perf/util/annotate-data.c
@@ -0,0 +1,163 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Convert sample address to data type using DWARF debug info.
+ *
+ * Written by Namhyung Kim
+ */
+
+#include
+#include
+
+#include "annotate-data.h"
+#include "debuginfo.h"
+#include "debug.h"
+#include "dso.h"
+#include "map.h"
+#include "map_symbol.h"
+#include "strbuf.h"
+#include "symbol.h"
+
+static bool find_cu_die(struct debuginfo *di, u64 pc, Dwarf_Die *cu_die)
+{
+	Dwarf_Off off, next_off;
+	size_t header_size;
+
+	if (dwarf_addrdie(di->dbg, pc, cu_die) != NULL)
+		return true;
+
+	/*
+	 * Some kernels don't have full aranges and contain only a few
+	 * aranges entries.  Fall back to iterating over all CU entries in
+	 * .debug_info in case it's missing.
+	 */
+	off = 0;
+	while (dwarf_nextcu(di->dbg, off, &next_off, &header_size,
+			    NULL, NULL, NULL) == 0) {
+		if (dwarf_offdie(di->dbg, off + header_size, cu_die) &&
+		    dwarf_haspc(cu_die, pc))
+			return true;
+
+		off = next_off;
+	}
+	return false;
+}
+
+/* The type info will be saved in @type_die */
+static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset)
+{
+	Dwarf_Word size;
+
+	/* Get the type of the variable */
+	if (die_get_real_type(var_die, type_die) == NULL) {
+		pr_debug("variable has no type\n");
+		return -1;
+	}
+
+	/*
+	 * It expects a pointer type for a memory access.
+	 * Convert to a real type it points to.
+	 */
+	if (dwarf_tag(type_die) != DW_TAG_pointer_type ||
+	    die_get_real_type(type_die, type_die) == NULL) {
+		pr_debug("no pointer or no type\n");
+		return -1;
+	}
+
+	/* Get the size of the actual type */
+	if (dwarf_aggregate_size(type_die, &size) < 0) {
+		pr_debug("type size is unknown\n");
+		return -1;
+	}
+
+	/* Minimal sanity check */
+	if ((unsigned)offset >= size) {
+		pr_debug("offset: %d is bigger than size: %lu\n", offset, size);
+		return -1;
+	}
+
+	return 0;
+}
+
+/* The result will be saved in @type_die */
+static int find_data_type_die(struct debuginfo *di, u64 pc,
+			      int reg, int offset, Dwarf_Die *type_die)
+{
+	Dwarf_Die cu_die, var_die;
+	Dwarf_Die *scopes = NULL;
+	int ret = -1;
+	int i, nr_scopes;
+
+	/* Get a compile_unit for this address */
+	if (!find_cu_die(di, pc, &cu_die)) {
+		pr_debug("cannot find CU for address %lx\n", pc);
+		return -1;
+	}
+
+	/* Get a list of nested scopes - i.e. (inlined) functions and blocks. */
+	nr_scopes = die_get_scopes(&cu_die, pc, &scopes);
+
+	/* Search from the inner-most scope to the outer */
+	for (i = nr_scopes - 1; i >= 0; i--) {
+		/* Look up variables/parameters in this scope */
+		if (!die_find_variable_by_reg(&scopes[i], pc, reg, &var_die))
+			continue;
+
+		/* Found a variable, see if it's correct */
+		ret = check_variable(&var_die, type_die, offset);
+		break;
+	}
+
+	free(scopes);
+	return ret;
+}
+
+/**
+ * find_data_type - Return a data type at the location
+ * @ms: map and symbol at the location
+ * @ip: instruction address of the memory access
+ * @reg: register that holds the base address
+ * @offset: offset from the base address
+ *
+ * This function searches the debug information of the binary to get the data
+ * type it accesses.  The exact location is expressed by (ip, reg, offset).
+ * It returns %NULL if not found.
+ */
+struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
+					   int reg, int offset)
+{
+	struct annotated_data_type *result = NULL;
+	struct dso *dso = ms->map->dso;
+	struct debuginfo *di;
+	Dwarf_Die type_die;
+	struct strbuf sb;
+	u64 pc;
+
+	di = debuginfo__new(dso->long_name);
+	if (di == NULL) {
+		pr_debug("cannot get the debug info\n");
+		return NULL;
+	}
+
+	/*
+	 * IP is an instruction address relative to the start of the map, as
+	 * it can be randomized/relocated.  It needs to be translated to PC,
+	 * which is a file address, for DWARF processing.
+	 */
+	pc = map__rip_2objdump(ms->map, ip);
+	if (find_data_type_die(di, pc, reg, offset, &type_die) < 0)
+		goto out;
+
+	result = zalloc(sizeof(*result));
+	if (result == NULL)
+		goto out;
+
+	strbuf_init(&sb, 32);
+	if (__die_get_typename(&type_die, &sb) < 0)
+		strbuf_add(&sb, "(unknown type)", 14);
+
+	result->type_name = strbuf_detach(&sb, NULL);
+
+out:
+	debuginfo__delete(di);
+	return result;
+}
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
new file mode 100644
index 000000000000..633147f78ca5
--- /dev/null
+++ b/tools/perf/util/annotate-data.h
@@ -0,0 +1,40 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _PERF_ANNOTATE_DATA_H
+#define _PERF_ANNOTATE_DATA_H
+
+#include
+#include
+#include
+
+struct map_symbol;
+
+/**
+ * struct annotated_data_type - Data type to profile
+ * @type_name: Name of the data type
+ * @type_size: Size of the data type
+ *
+ * This represents a data type accessed by samples in the profile data.
+ */
+struct annotated_data_type {
+	char *type_name;
+	int type_size;
+};
+
+#ifdef HAVE_DWARF_SUPPORT
+
+/* Returns data type at the location (ip, reg, offset) */
+struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
+					   int reg, int offset);
+
+#else /* HAVE_DWARF_SUPPORT */
+
+static inline struct annotated_data_type *
+find_data_type(struct map_symbol *ms __maybe_unused, u64 ip __maybe_unused,
+	       int reg __maybe_unused, int offset __maybe_unused)
+{
+	return NULL;
+}
+
+#endif /* HAVE_DWARF_SUPPORT */
+
+#endif /* _PERF_ANNOTATE_DATA_H */

From patchwork Thu Oct 12 03:50:34 2023
X-Patchwork-Submitter: Namhyung Kim
X-Patchwork-Id: 13418219
From: Namhyung Kim
To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML,
 linux-perf-users@vger.kernel.org, Linus Torvalds, Stephane Eranian,
 Masami Hiramatsu, linux-toolchains@vger.kernel.org,
 linux-trace-devel@vger.kernel.org
Subject: [PATCH 11/48] perf annotate-data: Add dso->data_types tree
Date: Wed, 11 Oct 2023 20:50:34 -0700
Message-ID: <20231012035111.676789-12-namhyung@kernel.org>
In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org>
References: <20231012035111.676789-1-namhyung@kernel.org>
X-Mailing-List: linux-trace-devel@vger.kernel.org

To aggregate accesses to the same data type, add a 'data_types' tree to
the DSO to maintain data types and look them up by name and size.

There may be different data types that happen to have the same name, so
the lookup also compares the size of the type.  This is not a 100%
guarantee, but it reduces the possibility of mishandling such
conflicts.  And I don't think it's common to have different types with
the same name.

Signed-off-by: Namhyung Kim
---
 tools/perf/util/annotate-data.c | 95 +++++++++++++++++++++++++++++----
 tools/perf/util/annotate-data.h |  9 ++++
 tools/perf/util/dso.c           |  4 ++
 tools/perf/util/dso.h           |  2 +
 4 files changed, 100 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index b3d519b7514b..23381c0a5d38 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -17,6 +17,76 @@
 #include "strbuf.h"
 #include "symbol.h"
 
+/*
+ * Compare type name and size to maintain them in a tree.
+ * I'm not sure if DWARF would have information of a single type in many
+ * different places (compilation units). If not, it could compare the
+ * offset of the type entry in the .debug_info section.
+ */
+static int data_type_cmp(const void *_key, const struct rb_node *node)
+{
+	const struct annotated_data_type *key = _key;
+	struct annotated_data_type *type;
+
+	type = rb_entry(node, struct annotated_data_type, node);
+
+	if (key->type_size != type->type_size)
+		return key->type_size - type->type_size;
+	return strcmp(key->type_name, type->type_name);
+}
+
+static bool data_type_less(struct rb_node *node_a, const struct rb_node *node_b)
+{
+	struct annotated_data_type *a, *b;
+
+	a = rb_entry(node_a, struct annotated_data_type, node);
+	b = rb_entry(node_b, struct annotated_data_type, node);
+
+	if (a->type_size != b->type_size)
+		return a->type_size < b->type_size;
+	return strcmp(a->type_name, b->type_name) < 0;
+}
+
+static struct annotated_data_type *dso__findnew_data_type(struct dso *dso,
+							  Dwarf_Die *type_die)
+{
+	struct annotated_data_type *result = NULL;
+	struct annotated_data_type key;
+	struct rb_node *node;
+	struct strbuf sb;
+	char *type_name;
+	Dwarf_Word size;
+
+	strbuf_init(&sb, 32);
+	if (__die_get_typename(type_die, &sb) < 0)
+		strbuf_add(&sb, "(unknown type)", 14);
+	type_name = strbuf_detach(&sb, NULL);
+	dwarf_aggregate_size(type_die, &size);
+
+	/* Check existing nodes in dso->data_types tree */
+	key.type_name = type_name;
+	key.type_size = size;
+	node = rb_find(&key, &dso->data_types, data_type_cmp);
+	if (node) {
+		result = rb_entry(node, struct annotated_data_type, node);
+		free(type_name);
+		return result;
+	}
+
+	/* If not, add a new one */
+	result = zalloc(sizeof(*result));
+	if (result == NULL) {
+		free(type_name);
+		return NULL;
+	}
+
+	result->type_name = type_name;
+	result->type_size = size;
+
+	rb_add(&result->node, &dso->data_types, data_type_less);
+	return result;
+}
+
 static bool find_cu_die(struct debuginfo *di, u64 pc, Dwarf_Die *cu_die)
 {
 	Dwarf_Off off, next_off;
@@ -129,7 +199,6 @@ struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
 	struct dso *dso = ms->map->dso;
 	struct debuginfo *di;
 	Dwarf_Die type_die;
-	struct strbuf sb;
 	u64 pc;
 
 	di = debuginfo__new(dso->long_name);
@@ -147,17 +216,23 @@ struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
 	if (find_data_type_die(di, pc, reg, offset, &type_die) < 0)
 		goto out;
 
-	result = zalloc(sizeof(*result));
-	if (result == NULL)
-		goto out;
-
-	strbuf_init(&sb, 32);
-	if (__die_get_typename(&type_die, &sb) < 0)
-		strbuf_add(&sb, "(unknown type)", 14);
-
-	result->type_name = strbuf_detach(&sb, NULL);
+	result = dso__findnew_data_type(dso, &type_die);
 
 out:
 	debuginfo__delete(di);
 	return result;
 }
+
+void annotated_data_type__tree_delete(struct rb_root *root)
+{
+	struct annotated_data_type *pos;
+
+	while (!RB_EMPTY_ROOT(root)) {
+		struct rb_node *node = rb_first(root);
+
+		rb_erase(node, root);
+		pos = rb_entry(node, struct annotated_data_type, node);
+		free(pos->type_name);
+		free(pos);
+	}
+}
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index 633147f78ca5..ab9f187bd7f1 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -4,6 +4,7 @@
 
 #include
 #include
+#include
 #include
 
 struct map_symbol;
@@ -16,6 +17,7 @@ struct map_symbol;
  * This represents a data type accessed by samples in the profile data.
  */
 struct annotated_data_type {
+	struct rb_node node;
 	char *type_name;
 	int type_size;
 };
@@ -26,6 +28,9 @@ struct annotated_data_type {
 struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
 					   int reg, int offset);
 
+/* Release all data type information in the tree */
+void annotated_data_type__tree_delete(struct rb_root *root);
+
 #else /* HAVE_DWARF_SUPPORT */
 
 static inline struct annotated_data_type *
@@ -35,6 +40,10 @@ find_data_type(struct map_symbol *ms __maybe_unused, u64 ip __maybe_unused,
 	return NULL;
 }
 
+static inline void annotated_data_type__tree_delete(struct rb_root *root __maybe_unused)
+{
+}
+
 #endif /* HAVE_DWARF_SUPPORT */
 
 #endif /* _PERF_ANNOTATE_DATA_H */
diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 1f629b6fb7cf..22fd5fa806ed 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -31,6 +31,7 @@
 #include "debug.h"
 #include "string2.h"
 #include "vdso.h"
+#include "annotate-data.h"
 
 static const char * const debuglink_paths[] = {
 	"%.0s%s",
@@ -1327,6 +1328,7 @@ struct dso *dso__new_id(const char *name, struct dso_id *id)
 		dso->data.cache = RB_ROOT;
 		dso->inlined_nodes = RB_ROOT_CACHED;
 		dso->srclines = RB_ROOT_CACHED;
+		dso->data_types = RB_ROOT;
 		dso->data.fd = -1;
 		dso->data.status = DSO_DATA_STATUS_UNKNOWN;
 		dso->symtab_type = DSO_BINARY_TYPE__NOT_FOUND;
@@ -1370,6 +1372,8 @@ void dso__delete(struct dso *dso)
 	symbols__delete(&dso->symbols);
 	dso->symbol_names_len = 0;
 	zfree(&dso->symbol_names);
+	annotated_data_type__tree_delete(&dso->data_types);
+
 	if (dso->short_name_allocated) {
 		zfree((char **)&dso->short_name);
 		dso->short_name_allocated = false;
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 3759de8c2267..ce9f3849a773 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -154,6 +154,8 @@ struct dso {
 	size_t symbol_names_len;
 	struct rb_root_cached inlined_nodes;
 	struct rb_root_cached srclines;
+	struct rb_root data_types;
+
 	struct {
 		u64 addr;
 		struct symbol *symbol;

From patchwork Thu Oct 12 03:50:35 2023
X-Patchwork-Submitter: Namhyung Kim
X-Patchwork-Id: 13418220
From: Namhyung Kim
To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML,
 linux-perf-users@vger.kernel.org, Linus Torvalds, Stephane Eranian,
 Masami Hiramatsu, linux-toolchains@vger.kernel.org,
 linux-trace-devel@vger.kernel.org
Subject: [PATCH 12/48] perf annotate: Factor out evsel__get_arch()
Date: Wed, 11 Oct 2023 20:50:35 -0700
Message-ID: <20231012035111.676789-13-namhyung@kernel.org>
In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org>
References: <20231012035111.676789-1-namhyung@kernel.org>
X-Mailing-List: linux-trace-devel@vger.kernel.org

evsel__get_arch() gets the architecture info from the environment.
It'll be used in other places later, so let's factor it out.  Also add
arch__is() to check the arch info by name.

Signed-off-by: Namhyung Kim
---
 tools/perf/util/annotate.c | 44 +++++++++++++++++++++++++++-----------
 tools/perf/util/annotate.h |  2 ++
 2 files changed, 33 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 605298410ed4..254cc9f224f4 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -840,6 +840,11 @@ static struct arch *arch__find(const char *name)
 	return bsearch(name, architectures, nmemb, sizeof(struct arch), arch__key_cmp);
 }
 
+bool arch__is(struct arch *arch, const char *name)
+{
+	return !strcmp(arch->name, name);
+}
+
 static struct annotated_source *annotated_source__new(void)
 {
 	struct annotated_source *src = zalloc(sizeof(*src));
@@ -2344,15 +2349,8 @@ void symbol__calc_percent(struct symbol *sym, struct evsel *evsel)
 	annotation__calc_percent(notes, evsel, symbol__size(sym));
 }
 
-int symbol__annotate(struct map_symbol *ms, struct evsel *evsel,
-		     struct annotation_options *options, struct arch **parch)
+static int evsel__get_arch(struct evsel *evsel, struct arch **parch)
 {
-	struct symbol *sym = ms->sym;
-	struct annotation *notes = symbol__annotation(sym);
-	struct annotate_args args = {
-		.evsel = evsel,
-		.options = options,
-	};
 	struct perf_env *env = evsel__env(evsel);
 	const char *arch_name = perf_env__arch(env);
 	struct arch *arch;
@@ -2361,23 +2359,43 @@ int symbol__annotate(struct map_symbol *ms, struct evsel *evsel,
 	if (!arch_name)
 		return errno;
 
-	args.arch = arch = arch__find(arch_name);
+	*parch = arch = arch__find(arch_name);
 	if (arch == NULL) {
 		pr_err("%s: unsupported arch %s\n", __func__, arch_name);
 		return ENOTSUP;
 	}
 
-	if (parch)
-		*parch = arch;
-
 	if
(arch->init) { err = arch->init(arch, env ? env->cpuid : NULL); if (err) { - pr_err("%s: failed to initialize %s arch priv area\n", __func__, arch->name); + pr_err("%s: failed to initialize %s arch priv area\n", + __func__, arch->name); return err; } } + return 0; +} + +int symbol__annotate(struct map_symbol *ms, struct evsel *evsel, + struct annotation_options *options, struct arch **parch) +{ + struct symbol *sym = ms->sym; + struct annotation *notes = symbol__annotation(sym); + struct annotate_args args = { + .evsel = evsel, + .options = options, + }; + struct arch *arch = NULL; + int err; + + err = evsel__get_arch(evsel, &arch); + if (err < 0) + return err; + + if (parch) + *parch = arch; + args.arch = arch; args.ms = *ms; if (notes->options && notes->options->full_addr) notes->start = map__objdump_2mem(ms->map, ms->sym->start); diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h index e33a55431bad..c74f8f10f705 100644 --- a/tools/perf/util/annotate.h +++ b/tools/perf/util/annotate.h @@ -61,6 +61,8 @@ struct ins_operands { struct arch; +bool arch__is(struct arch *arch, const char *name); + struct ins_ops { void (*free)(struct ins_operands *ops); int (*parse)(struct arch *arch, struct ins_operands *ops, struct map_symbol *ms); From patchwork Thu Oct 12 03:50:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418222 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 237A5A3C; Thu, 12 Oct 2023 03:51:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="fh/U7yOU" Received: from mail-yw1-x1132.google.com (mail-yw1-x1132.google.com [IPv6:2607:f8b0:4864:20::1132]) by 
lindbergh.monkeyblade.net (Postfix) with ESMTPS id A5F08116; Wed, 11 Oct 2023 20:51:33 -0700 (PDT) Received: by mail-yw1-x1132.google.com with SMTP id 00721157ae682-5a7d532da4bso6895567b3.2; Wed, 11 Oct 2023 20:51:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082692; x=1697687492; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=wwugeUz2Kizrt8bo2ssr4th1+PIs+ybthYkaJJ9iBxc=; b=fh/U7yOUJdScqGCTsxHDN6EaC7RXG4Di7DwztT/Vy8dYflmO6SJcSkrFQXnpDH1oGl nmdjP1LgWaO2EVjF70TBALpFbEOlhYiGnwMpvfzXgPTVuasbFK8lUaF4rDHXimvEM1ui F2jEtu8j8F0GwWPKtkzF3F7AouMZ+KDkJomVCNnrO3YNdV0UUWo6LQlV2gbTLxIYH55j bQBK0mOF+jVqOs806CMlr2P3nkDlUzMnrooiR+KsYXla9nqppo+lRj9Xjhrrq0lrCGpi FBMTHdGwHjkPYBBal2mCNfeMYW4uMRFz72sI76GCf4ZC28DOwSCe+J8Zoyip6z7c1znr XKfA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082692; x=1697687492; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=wwugeUz2Kizrt8bo2ssr4th1+PIs+ybthYkaJJ9iBxc=; b=SyEYSzBTWVdzKK+cNSJmNXnLrEoNaiVKJMgZln/5bbgS8wSYVm0SqxmdKuWqVKlj0h LVZv/TWpm3DCT76b1hCpUhVqBnab5IxcArR7uO2cqvAVz0yuCAizZs/bIl0eqlXTtAM8 pYAU0XG7VbWTpwfrTV7erVK4+RhtcW+NHgZ0JH+0xP6UYmn98cKjSaR27LK+ePr/QbRV LlT/LlZtdH5EN9dHiAgW13DcxGWhyZ6UTusEpezzzsK3wxWVh8/bc07xEZgniX7P45nw 6NTu4t9FehMsVJ9rWMqdh8qybgv95ulbu86sp/q76Oym56zk/Sd3NYV9LvuuV2VW4P5c e5Zw== X-Gm-Message-State: AOJu0YwYh34OLfFo0eRsqg4t8AN6ap1xHPmM29/KF0uG4QkD6XIJRZYU RrWpXRDzUhXqDNTeQPdZXgU= X-Google-Smtp-Source: AGHT+IHylZn0o9l2x1CECps0Fs/T61+2AgZHx3X6wLdIXl9pwLIFxKlIo0r01FrU6L/p+JpPh+pnuA== X-Received: by 2002:a05:690c:3744:b0:5a7:d4a2:cd13 with SMTP id fw4-20020a05690c374400b005a7d4a2cd13mr5373990ywb.8.1697082692066; Wed, 11 Oct 2023 20:51:32 -0700 (PDT) Received: from 
bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.51.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:51:31 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 13/48] perf annotate: Add annotate_get_insn_location() Date: Wed, 11 Oct 2023 20:50:36 -0700 Message-ID: <20231012035111.676789-14-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net The annotate_get_insn_location() is to get the detailed information of instruction locations like registers and offset. It has source and target operands locations in an array. Each operand can have a register and an offset. The offset is meaningful when mem_ref flag is set. 
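For illustration only, the operand handling described above can be sketched in a standalone form. This is not the code in the patch: struct op_loc_sketch and parse_operand_sketch() are hypothetical names, the sketch keeps the register name as a string instead of calling get_dwarf_regnum(), and it assumes x86 AT&T syntax with '%' as the register prefix and '(' as the memory reference marker.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct annotated_op_loc (name, not DWARF number). */
struct op_loc_sketch {
	char reg[16];	/* register name without '%', "" if none */
	long offset;	/* meaningful only when mem_ref is true */
	bool mem_ref;	/* operand contains '(' */
};

/* Parse one AT&T-style operand such as "0x18(%rbx)", "(%rax)" or "%rcx". */
static int parse_operand_sketch(const char *str, struct op_loc_sketch *loc)
{
	const char *p;
	char *end;
	size_t n = 0;

	assert(str != NULL && loc != NULL);
	memset(loc, 0, sizeof(*loc));

	loc->mem_ref = strchr(str, '(') != NULL;
	if (loc->mem_ref) {
		/* Skip a segment prefix like "%gs:" in front of the offset. */
		if (*str == '%') {
			p = strchr(str, ':');
			if (p && p < strchr(str, '('))
				str = p + 1;
		}
		/* An omitted offset, as in "(%rax)", parses as 0. */
		loc->offset = strtol(str, &end, 0);
		str = end;
	}

	p = strchr(str, '%');
	if (p == NULL)
		return loc->mem_ref ? -1 : 0;	/* memory operand needs a register */

	/* Copy the register name that follows '%'. */
	p++;
	while (isalnum((unsigned char)p[n]) && n < sizeof(loc->reg) - 1)
		n++;
	memcpy(loc->reg, p, n);
	loc->reg[n] = '\0';
	return 0;
}
```

With this sketch, "0x18(%rbx)" yields reg "rbx", offset 0x18 and mem_ref set, while "%rcx" yields reg "rcx" with mem_ref clear, so the offset is ignored.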
Signed-off-by: Namhyung Kim --- tools/perf/util/annotate.c | 107 +++++++++++++++++++++++++++++++++++++ tools/perf/util/annotate.h | 36 +++++++++++++ 2 files changed, 143 insertions(+) diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index 254cc9f224f4..9d653a1e84ce 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -31,6 +31,7 @@ #include "bpf-utils.h" #include "block-range.h" #include "string2.h" +#include "dwarf-regs.h" #include "util/event.h" #include "util/sharded_mutex.h" #include "arch/common.h" @@ -3484,3 +3485,109 @@ int annotate_check_args(struct annotation_options *args) } return 0; } + +/* + * Get register number and access offset from the given instruction. + * It assumes AT&T x86 asm format like OFFSET(REG). Maybe it needs + * to revisit the format when it handles different architecture. + * Fills @reg and @offset when return 0. + */ +static int extract_reg_offset(struct arch *arch, const char *str, + struct annotated_op_loc *op_loc) +{ + char *p; + char *regname; + + if (arch->objdump.register_char == 0) + return -1; + + /* + * It should start from offset, but it's possible to skip 0 + * in the asm. So 0(%rax) should be same as (%rax). + * + * However, it also start with a segment select register like + * %gs:0x18(%rbx). In that case it should skip the part. + */ + if (*str == arch->objdump.register_char) { + while (*str && !isdigit(*str) && + *str != arch->objdump.memory_ref_char) + str++; + } + + op_loc->offset = strtol(str, &p, 0); + + p = strchr(p, arch->objdump.register_char); + if (p == NULL) + return -1; + + regname = strdup(p); + if (regname == NULL) + return -1; + + op_loc->reg = get_dwarf_regnum(regname, 0); + free(regname); + return 0; +} + +/** + * annotate_get_insn_location - Get location of instruction + * @arch: the architecture info + * @dl: the target instruction + * @loc: a buffer to save the data + * + * Get detailed location info (register and offset) in the instruction. 
+ * It needs both source and target operand and whether it accesses a + * memory location. The offset field is meaningful only when the + * corresponding mem flag is set. + * + * Some examples on x86: + * + * mov (%rax), %rcx # src_reg = rax, src_mem = 1, src_offset = 0 + * # dst_reg = rcx, dst_mem = 0 + * + * mov 0x18, %r8 # src_reg = -1, dst_reg = r8 + */ +int annotate_get_insn_location(struct arch *arch, struct disasm_line *dl, + struct annotated_insn_loc *loc) +{ + struct ins_operands *ops; + struct annotated_op_loc *op_loc; + int i; + + if (!strcmp(dl->ins.name, "lock")) + ops = dl->ops.locked.ops; + else + ops = &dl->ops; + + if (ops == NULL) + return -1; + + memset(loc, 0, sizeof(*loc)); + + for_each_insn_op_loc(loc, i, op_loc) { + const char *insn_str = ops->source.raw; + + if (i == INSN_OP_TARGET) + insn_str = ops->target.raw; + + /* Invalidate the register by default */ + op_loc->reg = -1; + + if (insn_str == NULL) + continue; + + if (strchr(insn_str, arch->objdump.memory_ref_char)) { + op_loc->mem_ref = true; + extract_reg_offset(arch, insn_str, op_loc); + } else { + char *s = strdup(insn_str); + + if (s) { + op_loc->reg = get_dwarf_regnum(s, 0); + free(s); + } + } + } + + return 0; +} diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h index c74f8f10f705..4adda492233d 100644 --- a/tools/perf/util/annotate.h +++ b/tools/perf/util/annotate.h @@ -437,4 +437,40 @@ int annotate_parse_percent_type(const struct option *opt, const char *_str, int annotate_check_args(struct annotation_options *args); +/** + * struct annotated_op_loc - Location info of instruction operand + * @reg: Register in the operand + * @offset: Memory access offset in the operand + * @mem_ref: Whether the operand accesses memory + */ +struct annotated_op_loc { + int reg; + int offset; + bool mem_ref; +}; + +enum annotated_insn_ops { + INSN_OP_SOURCE = 0, + INSN_OP_TARGET = 1, + + INSN_OP_MAX, +}; + +/** + * struct annotated_insn_loc - Location info of instruction + * @ops: 
Array of location info for source and target operands + */ +struct annotated_insn_loc { + struct annotated_op_loc ops[INSN_OP_MAX]; +}; + +#define for_each_insn_op_loc(insn_loc, i, op_loc) \ + for (i = INSN_OP_SOURCE, op_loc = &(insn_loc)->ops[i]; \ + i < INSN_OP_MAX; \ + i++, op_loc++) + +/* Get detailed location info in the instruction */ +int annotate_get_insn_location(struct arch *arch, struct disasm_line *dl, + struct annotated_insn_loc *loc); + #endif /* __PERF_ANNOTATE_H */ From patchwork Thu Oct 12 03:50:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418221 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C4689A38; Thu, 12 Oct 2023 03:51:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="C29ND8KE" Received: from mail-pj1-x1032.google.com (mail-pj1-x1032.google.com [IPv6:2607:f8b0:4864:20::1032]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 19E1111A; Wed, 11 Oct 2023 20:51:34 -0700 (PDT) Received: by mail-pj1-x1032.google.com with SMTP id 98e67ed59e1d1-2773f776f49so384420a91.1; Wed, 11 Oct 2023 20:51:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082693; x=1697687493; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=XqS3acjNyfHXAKt2i/1MQ99UO2nZZIth1PMAMz3P27M=; b=C29ND8KEE5MDVIMfoJfdXWO59SmotPpaHeA47qxLOHhvXOnAZI3kMXWMxT9Hox/S5z XJS1s0wyWGkXy7bWHT0eUZgVc8JgjE30oXQ/k8ArVDwEgAzs+IA+QLLpj7NDWhimfsR8 ZjFzZTDKZf628b9AuNNI2S8oltRKJaucQxmoR7120Xd2Pi4kcSRKThWryEYKF3PC+RyZ 
Dtc9wdee8HOCgdcfQEnCXeGpxamj4x+tBAcRm5SpkaG/yiYiM9fB9u8ZAuPcIS7o6lo0 jkiaFfepENb4SxSuu+T5kANY+c8uRJu9oHrHbJEIYSkzZzWSrWd+3JyBwJodyvCEmqmq pe2A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082693; x=1697687493; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=XqS3acjNyfHXAKt2i/1MQ99UO2nZZIth1PMAMz3P27M=; b=hRRa4OGA9lBWRDoA1/udlsv4DvUKAq1stiuyKPyJitaUnUJvDc/SQXaOBhqb13s9Rs WxGMaRy5oK5fV/vw8xXe55KhHM4CaQ74WnNYSqBhApo/dAwZ1WxsGA8Vna5ml89ttXBU mdKh1kkEQeWJBOCItTNIafnNGYq1oeNvdpplochoNqRFRzaJB0mlM66nAFRuBLraQXif 3Vy6zWa2CBheAm1qP3gsv9kMH9DagkjWmxhlhX0FbemE5ob3AEoAP6cmvwbexDcALOBa /XHKywz+1VPFtEg7/xtfiVG5sISzz3qRGTGGuND0zbNwxYZuwu9BUBvMowZpRzPtxsVe I7JA== X-Gm-Message-State: AOJu0YzjC191Ceqznwl3B/H+XW6vXaEy7URyIMfL7PHq/RA7CeIWNC7L WGiNPas7gv/geHfTzPgA/Y0= X-Google-Smtp-Source: AGHT+IHU2Vzhh9c+hkafnr864Gn2ZGcWbCgPbXYDp4+67dqafL09eG+KoUve+ApCUO/3uvOuZCOUIA== X-Received: by 2002:a17:90a:858b:b0:277:5cd5:6f80 with SMTP id m11-20020a17090a858b00b002775cd56f80mr28649847pjn.16.1697082693327; Wed, 11 Oct 2023 20:51:33 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.51.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:51:33 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 14/48] perf annotate: Implement hist_entry__get_data_type() Date: Wed, 11 Oct 2023 20:50:37 -0700 Message-ID: <20231012035111.676789-15-namhyung@kernel.org> X-Mailer: 
git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net It's the function to find out the type info from the given sample data and will be called from the hist_entry sort logic when 'type' sort key is used. It first calls objdump to disassemble the instructions and figure out information about memory access at the location. Maybe we can do it better by analyzing the instruction directly, but I'll leave it for later work. The memory access is determined by checking instruction operands to have "(" and then extract register name and offset. It'll return NULL if no data type is found. 
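As a rough sketch of the lookup described above (not the code in this patch), the two steps are: find the disassembled line whose address matches the sample IP, then take the first operand that contains "(". The types and names below (struct insn_sketch, find_insn_sketch(), first_mem_operand_sketch()) are hypothetical simplifications of struct disasm_line and its list walk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct disasm_line. */
struct insn_sketch {
	int64_t offset;			/* offset from symbol start, -1 for non-insn lines */
	const char *operands[2];	/* raw source and target operand text */
};

/* Find the instruction located at @ip, or NULL if there is none. */
static const struct insn_sketch *
find_insn_sketch(const struct insn_sketch *insns, size_t nr,
		 uint64_t sym_start, uint64_t ip)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (insns[i].offset < 0)
			continue;	/* skip non-instruction lines */
		if (sym_start + (uint64_t)insns[i].offset == ip)
			return &insns[i];
	}
	return NULL;
}

/* Return the first operand that looks like a memory access, or NULL. */
static const char *first_mem_operand_sketch(const struct insn_sketch *insn)
{
	size_t i;

	assert(insn != NULL);
	for (i = 0; i < 2; i++) {
		const char *op = insn->operands[i];

		if (op && strchr(op, '('))
			return op;
	}
	return NULL;
}
```

A linear walk like this is done per sample, which is the cost the "too slow" remark in the code refers to; an index keyed by address would be one way to avoid it, but that is outside this patch.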
Signed-off-by: Namhyung Kim --- tools/perf/util/annotate.c | 85 ++++++++++++++++++++++++++++++++++++++ tools/perf/util/annotate.h | 4 ++ 2 files changed, 89 insertions(+) diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index 9d653a1e84ce..e5dc3d6fc6d0 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -25,6 +25,7 @@ #include "units.h" #include "debug.h" #include "annotate.h" +#include "annotate-data.h" #include "evsel.h" #include "evlist.h" #include "bpf-event.h" @@ -3591,3 +3592,87 @@ int annotate_get_insn_location(struct arch *arch, struct disasm_line *dl, return 0; } + +static void symbol__ensure_annotate(struct map_symbol *ms, struct evsel *evsel) +{ + struct disasm_line *dl, *tmp_dl; + struct annotation *notes; + + notes = symbol__annotation(ms->sym); + if (!list_empty(&notes->src->source)) + return; + + if (symbol__annotate(ms, evsel, notes->options, NULL) < 0) + return; + + /* remove non-insn disasm lines for simplicity */ + list_for_each_entry_safe(dl, tmp_dl, &notes->src->source, al.node) { + if (dl->al.offset == -1) { + list_del(&dl->al.node); + free(dl); + } + } +} + +static struct disasm_line *find_disasm_line(struct symbol *sym, u64 ip) +{ + struct disasm_line *dl; + struct annotation *notes; + + notes = symbol__annotation(sym); + + list_for_each_entry(dl, &notes->src->source, al.node) { + if (sym->start + dl->al.offset == ip) + return dl; + } + return NULL; +} + +/** + * hist_entry__get_data_type - find data type for given hist entry + * @he: hist entry + * + * This function first annotates the instruction at @he->ip and extracts + * register and offset info from it. Then it searches the DWARF debug + * info to get a variable and type information using the address, register, + * and offset.
+ */ +struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he) +{ + struct map_symbol *ms = &he->ms; + struct evsel *evsel = hists_to_evsel(he->hists); + struct arch *arch; + struct disasm_line *dl; + struct annotated_insn_loc loc; + struct annotated_op_loc *op_loc; + u64 ip = he->ip; + int i; + + if (ms->map == NULL || ms->sym == NULL) + return NULL; + + if (evsel__get_arch(evsel, &arch) < 0) + return NULL; + + /* Make sure it runs objdump to get disasm of the function */ + symbol__ensure_annotate(ms, evsel); + + /* + * Get a disasm to extract the location from the insn. + * This is too slow... + */ + dl = find_disasm_line(ms->sym, ip); + if (dl == NULL) + return NULL; + + if (annotate_get_insn_location(arch, dl, &loc) < 0) + return NULL; + + for_each_insn_op_loc(&loc, i, op_loc) { + if (!op_loc->mem_ref) + continue; + + return find_data_type(ms, ip, op_loc->reg, op_loc->offset); + } + return NULL; +} diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h index 4adda492233d..299b4a18e804 100644 --- a/tools/perf/util/annotate.h +++ b/tools/perf/util/annotate.h @@ -23,6 +23,7 @@ struct option; struct perf_sample; struct evsel; struct symbol; +struct annotated_data_type; struct ins { const char *name; @@ -473,4 +474,7 @@ struct annotated_insn_loc { int annotate_get_insn_location(struct arch *arch, struct disasm_line *dl, struct annotated_insn_loc *loc); +/* Returns a data type from the sample instruction (if any) */ +struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he); + #endif /* __PERF_ANNOTATE_H */ From patchwork Thu Oct 12 03:50:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418224 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org 
(Postfix) with ESMTPS id DE95CEBC; Thu, 12 Oct 2023 03:51:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="jTXr5cED" Received: from mail-pf1-x435.google.com (mail-pf1-x435.google.com [IPv6:2607:f8b0:4864:20::435]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E54BD12C; Wed, 11 Oct 2023 20:51:37 -0700 (PDT) Received: by mail-pf1-x435.google.com with SMTP id d2e1a72fcca58-690bc3f82a7so465775b3a.0; Wed, 11 Oct 2023 20:51:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082694; x=1697687494; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=9Mm0EFKSyzZeq7wlW+V5DIgnESjW48EHe3H1udzSRfI=; b=jTXr5cEDneyU68rbb5CNcqO+exc4TsJZFY/p/3whJSNmvknj3+pHapCkOkA/OHYoGD 0xkbfyL29BVtoHIYmLHVkLxGas44IQwOO73o0gy+jYa+ebPO5z9Uqo8gzgXO1BFyOPtk 1Iw6b8Fh/NUG33XVkc0qzdzcfrVBM9SK0wqd5mSIA50EcarvHs3qQGvZE+/EgcIMQvzn WlVtAufXGDMn5jTwIpbHUoAyzjbKojcgVhMG3emX1akdIKD6F8Wikldx3Rwe9Ww7NJzz Vmos4ZY+0oPDXwZhAk+0s777poBZL2z2z4AmNsrlemL+iO0yV9438gp4bND7dYMdvFG/ EPeA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082694; x=1697687494; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=9Mm0EFKSyzZeq7wlW+V5DIgnESjW48EHe3H1udzSRfI=; b=NO1Q7RqMWTh9Qy7DTMgaVDJnbDK+nkkVRR6VZKrntLStPyBkujyIjKLkxh9RMFeHPF /gkIFENsT28EOirtf+zwDTH+Eluxw8rsJ8SVj3/vREkrJ9cP5Fbx7flM2KL6nbNmJohM UDA/SVpEp39kAY0DOjtowDcLl18c4sCE4XW448M468n4nBOw/tIn5VZETQA6VSBzqEda mQhD0ieL4tH6y3GYvCrNBCRvt7TIldM8zhlSbti8nv9QTOOdO6M8PEFnDrOWVcBeB+X8 8u6bL+5+pljyhVQm7Ae3RZHzLfUsDG53BGvzWlGJ4cG1UfxFbjBrxyovqPrY/NXL0zeZ WXZw== X-Gm-Message-State: AOJu0YylcV8x9GCXGKki5jUrwoWvRvs7oZP71ST6MBegvVVfhq+PfAMF 
gmGXKYdY1JFAxKrtNZc4//g= X-Google-Smtp-Source: AGHT+IFcT2BsO5as9+E/OMNGu003MjUidyOdjBLI4kMSG8Rqcj5UofTcT31NaKmciu8TxK3TKE2I+w== X-Received: by 2002:a05:6a21:a5a3:b0:15b:c800:48af with SMTP id gd35-20020a056a21a5a300b0015bc80048afmr26689958pzc.23.1697082694569; Wed, 11 Oct 2023 20:51:34 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.51.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:51:34 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 15/48] perf report: Add 'type' sort key Date: Wed, 11 Oct 2023 20:50:38 -0700 Message-ID: <20231012035111.676789-16-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net The 'type' sort key is to aggregate hist entries by data type they access. Add mem_type field to hist_entry struct to save the type. If hist_entry__get_data_type() returns NULL, it'd use the 'unknown_type' instance. 
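The fallback described above can be sketched in isolation as follows. This is an illustration under assumptions, not the code in the patch: the *_sketch types and the resolve callback are hypothetical, and the callback stands in for hist_entry__get_data_type().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced versions of annotated_data_type and hist_entry. */
struct data_type_sketch {
	const char *type_name;
};

struct entry_sketch {
	const struct data_type_sketch *mem_type;	/* NULL until resolved */
};

/* Shared placeholder used when no data type can be found. */
static const struct data_type_sketch unknown_type_sketch = { "(unknown)" };

/* Stand-in for the type lookup; may return NULL. */
typedef const struct data_type_sketch *(*resolve_fn)(const struct entry_sketch *);

/* Resolve the type once and fall back to the placeholder on failure. */
static void entry_init_type_sketch(struct entry_sketch *e, resolve_fn resolve)
{
	assert(e != NULL);
	if (e->mem_type)
		return;

	e->mem_type = resolve ? resolve(e) : NULL;
	if (e->mem_type == NULL)
		e->mem_type = &unknown_type_sketch;
}

/* Entries with the same type name compare equal and can be merged. */
static int entry_type_cmp_sketch(struct entry_sketch *left,
				 struct entry_sketch *right,
				 resolve_fn resolve)
{
	entry_init_type_sketch(left, resolve);
	entry_init_type_sketch(right, resolve);
	return strcmp(left->mem_type->type_name, right->mem_type->type_name);
}
```

Because every unresolved entry points at the same placeholder, those entries compare equal and end up in a single "(unknown)" row instead of being dropped.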
Signed-off-by: Namhyung Kim --- tools/perf/Documentation/perf-report.txt | 1 + tools/perf/util/annotate-data.c | 5 ++ tools/perf/util/annotate-data.h | 2 + tools/perf/util/hist.h | 1 + tools/perf/util/sort.c | 65 +++++++++++++++++++++++- tools/perf/util/sort.h | 4 ++ 6 files changed, 76 insertions(+), 2 deletions(-) diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt index af068b4f1e5a..aec34417090b 100644 --- a/tools/perf/Documentation/perf-report.txt +++ b/tools/perf/Documentation/perf-report.txt @@ -118,6 +118,7 @@ OPTIONS - retire_lat: On X86, this reports pipeline stall of this instruction compared to the previous instruction in cycles. And currently supported only on X86 - simd: Flags describing a SIMD operation. "e" for empty Arm SVE predicate. "p" for partial Arm SVE predicate + - type: Data type of sample memory access. By default, comm, dso and symbol keys are used. (i.e. --sort comm,dso,symbol) diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c index 23381c0a5d38..3e3a561d73e3 100644 --- a/tools/perf/util/annotate-data.c +++ b/tools/perf/util/annotate-data.c @@ -17,6 +17,11 @@ #include "strbuf.h" #include "symbol.h" +/* Pseudo data types */ +struct annotated_data_type unknown_type = { + .type_name = (char *)"(unknown)", +}; + /* * Compare type name and size to maintain them in a tree. 
* I'm not sure if DWARF would have information of a single type in many diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h index ab9f187bd7f1..6efdd7e21b28 100644 --- a/tools/perf/util/annotate-data.h +++ b/tools/perf/util/annotate-data.h @@ -22,6 +22,8 @@ struct annotated_data_type { int type_size; }; +extern struct annotated_data_type unknown_type; + #ifdef HAVE_DWARF_SUPPORT /* Returns data type at the location (ip, reg, offset) */ diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h index afc9f1c7f4dc..9bfed867f288 100644 --- a/tools/perf/util/hist.h +++ b/tools/perf/util/hist.h @@ -82,6 +82,7 @@ enum hist_column { HISTC_ADDR_TO, HISTC_ADDR, HISTC_SIMD, + HISTC_TYPE, HISTC_NR_COLS, /* Last entry */ }; diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c index 6aa1c7f2b444..c79564c1d5df 100644 --- a/tools/perf/util/sort.c +++ b/tools/perf/util/sort.c @@ -24,6 +24,7 @@ #include "strbuf.h" #include "mem-events.h" #include "annotate.h" +#include "annotate-data.h" #include "event.h" #include "time-utils.h" #include "cgroup.h" @@ -2094,7 +2095,7 @@ struct sort_entry sort_dso_size = { .se_width_idx = HISTC_DSO_SIZE, }; -/* --sort dso_size */ +/* --sort addr */ static int64_t sort__addr_cmp(struct hist_entry *left, struct hist_entry *right) @@ -2131,6 +2132,65 @@ struct sort_entry sort_addr = { .se_width_idx = HISTC_ADDR, }; +/* --sort type */ + +static int64_t +sort__type_cmp(struct hist_entry *left, struct hist_entry *right) +{ + return sort__addr_cmp(left, right); +} + +static void sort__type_init(struct hist_entry *he) +{ + if (he->mem_type) + return; + + he->mem_type = hist_entry__get_data_type(he); + if (he->mem_type == NULL) + he->mem_type = &unknown_type; +} + +static int64_t +sort__type_collapse(struct hist_entry *left, struct hist_entry *right) +{ + struct annotated_data_type *left_type = left->mem_type; + struct annotated_data_type *right_type = right->mem_type; + + if (!left_type) { + sort__type_init(left); + 
left_type = left->mem_type; + } + + if (!right_type) { + sort__type_init(right); + right_type = right->mem_type; + } + + return strcmp(left_type->type_name, right_type->type_name); +} + +static int64_t +sort__type_sort(struct hist_entry *left, struct hist_entry *right) +{ + return sort__type_collapse(left, right); +} + +static int hist_entry__type_snprintf(struct hist_entry *he, char *bf, + size_t size, unsigned int width) +{ + return repsep_snprintf(bf, size, "%-*s", width, he->mem_type->type_name); +} + +struct sort_entry sort_type = { + .se_header = "Data Type", + .se_cmp = sort__type_cmp, + .se_collapse = sort__type_collapse, + .se_sort = sort__type_sort, + .se_init = sort__type_init, + .se_snprintf = hist_entry__type_snprintf, + .se_width_idx = HISTC_TYPE, +}; + struct sort_dimension { const char *name; @@ -2185,7 +2245,8 @@ static struct sort_dimension common_sort_dimensions[] = { DIM(SORT_ADDR, "addr", sort_addr), DIM(SORT_LOCAL_RETIRE_LAT, "local_retire_lat", sort_local_p_stage_cyc), DIM(SORT_GLOBAL_RETIRE_LAT, "retire_lat", sort_global_p_stage_cyc), - DIM(SORT_SIMD, "simd", sort_simd) + DIM(SORT_SIMD, "simd", sort_simd), + DIM(SORT_ANNOTATE_DATA_TYPE, "type", sort_type), }; #undef DIM diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h index ecfb7f1359d5..aabf0b8331a3 100644 --- a/tools/perf/util/sort.h +++ b/tools/perf/util/sort.h @@ -15,6 +15,7 @@ struct option; struct thread; +struct annotated_data_type; extern regex_t parent_regex; extern const char *sort_order; @@ -34,6 +35,7 @@ extern struct sort_entry sort_dso_to; extern struct sort_entry sort_sym_from; extern struct sort_entry sort_sym_to; extern struct sort_entry sort_srcline; +extern struct sort_entry sort_type; extern const char default_mem_sort_order[]; extern bool chk_double_cl; @@ -154,6 +156,7 @@ struct hist_entry { struct perf_hpp_list *hpp_list; struct hist_entry *parent_he; struct hist_entry_ops *ops; + struct annotated_data_type *mem_type; union { /* this is for hierarchical 
entry structure */ struct { @@ -243,6 +246,7 @@ enum sort_type { SORT_LOCAL_RETIRE_LAT, SORT_GLOBAL_RETIRE_LAT, SORT_SIMD, + SORT_ANNOTATE_DATA_TYPE, /* branch stack specific sort keys */ __SORT_BRANCH_STACK, From patchwork Thu Oct 12 03:50:39 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418223 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DE935EA3; Thu, 12 Oct 2023 03:51:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="kKTQafDw" Received: from mail-pl1-x636.google.com (mail-pl1-x636.google.com [IPv6:2607:f8b0:4864:20::636]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E6F32138; Wed, 11 Oct 2023 20:51:38 -0700 (PDT) Received: by mail-pl1-x636.google.com with SMTP id d9443c01a7336-1bdf4752c3cso4012275ad.2; Wed, 11 Oct 2023 20:51:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082696; x=1697687496; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=ZII5vF3dprwXJP49CiS8ixHpGtBs+wAGFWOW7yk/k14=; b=kKTQafDwBS3LnqLJJl/sGP0kPj0dFyOBIdszVkK/c2O2NsyGGINKG2UTmdGs5LyK0q DbrAzKI97ANciQ9bHroC3HygOoGshLomfSW6ohN3RIj2iQn24zWWBmiehttQ1HkWg24Z wR2xqRRF46EOmQ2FeklYlhbmWIVXuAH4wsfUQuf7UfzF8WZ4FC5nkOtV5KcEP3Fs3u5z vy0T369S9P++g3ZvqQuF0dbgjn/njnmSuBGsplqFcLANPntNA3mchl+8B0InJ0CMF4Os /Bt0RODOFS06ypD4T3hxCSQ3LoszGLIX2yvAgZ827FhrVARijcvvTz+VGxyMEfA5I60m BCng== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082696; x=1697687496; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=ZII5vF3dprwXJP49CiS8ixHpGtBs+wAGFWOW7yk/k14=; b=C4JwWdn4wVVKkSHJMuRHfY7MuX/Op71+9xPubXesNCeizBGlVnFPiH2GDEyG+cX1ed G97ceBTOzws+e3UYGrpXswelTycUG4BauNw8CjCdiZvUd2MHAoFsP+DC8efxBdG/BejH sNLK/XuwLJSmFMClTQ269Jpq1hzfz8AmZ+sCZGhr8LJQ1H4M9U0+Yk6vHppdz0qWrGtX umsAnXet5JHil0KS7airM/yAFVmManFDrAAzXAKhwO9fDRmq12HP5Y2c1dFxplQ4YtA8 gPTp0r9Irg1OGtiIQYDl/ydpe1HYRU3JNKdlk51JwU870uu3Yz6Rg1/qUe7bHPrH9G/m uEtg== X-Gm-Message-State: AOJu0YyCY1a3LgClmb59AeKdEjPFZiBMLg+RgKj2h6seVMl9aQVbxyOS 13GEA1RURYU7aSuFz+3s+EE= X-Google-Smtp-Source: AGHT+IEIgqdokKUdY0Jf5HR5a5Ezvauld/zyeGxQe5mfTKMuxdBz2qQI6SIiqrgYj4w0bmg90PJhgQ== X-Received: by 2002:a17:902:6bc5:b0:1c9:bef4:e11 with SMTP id m5-20020a1709026bc500b001c9bef40e11mr5717841plt.46.1697082695866; Wed, 11 Oct 2023 20:51:35 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.51.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:51:35 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 16/48] perf report: Support data type profiling Date: Wed, 11 Oct 2023 20:50:39 -0700 Message-ID: <20231012035111.676789-17-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 
1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Enable type annotation when the 'type' sort key is used. It shows type of variables the samples access at the moment. Users can see which types are accessed frequently. $ perf report -s dso,type --stdio ... # Overhead Shared Object Data Type # ........ ................. ......... # 35.47% [kernel.kallsyms] (unknown) 1.62% [kernel.kallsyms] struct sched_entry 1.23% [kernel.kallsyms] struct cfs_rq 0.83% [kernel.kallsyms] struct task_struct 0.34% [kernel.kallsyms] struct list_head 0.30% [kernel.kallsyms] struct mem_cgroup ... Signed-off-by: Namhyung Kim --- tools/perf/builtin-report.c | 19 ++++++++++++++++--- 1 file changed, 16 insertions(+), 3 deletions(-) diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c index dcedfe00f04d..e60c6bb32d92 100644 --- a/tools/perf/builtin-report.c +++ b/tools/perf/builtin-report.c @@ -96,6 +96,7 @@ struct report { bool stitch_lbr; bool disable_order; bool skip_empty; + bool data_type; int max_stack; struct perf_read_values show_threads_values; struct annotation_options annotation_opts; @@ -171,7 +172,7 @@ static int hist_iter__report_callback(struct hist_entry_iter *iter, struct mem_info *mi; struct branch_info *bi; - if (!ui__has_annotation() && !rep->symbol_ipc) + if (!ui__has_annotation() && !rep->symbol_ipc && !rep->data_type) return 0; if (sort__mode == SORT_MODE__BRANCH) { @@ -323,10 +324,19 @@ static int process_sample_event(struct perf_tool *tool, if (al.map != NULL) map__dso(al.map)->hit = 1; - if (ui__has_annotation() || rep->symbol_ipc || rep->total_cycles_mode) { + if (ui__has_annotation() || rep->symbol_ipc || rep->total_cycles_mode || + rep->data_type) { 
hist__account_cycles(sample->branch_stack, &al, sample, rep->nonany_branch_mode, &rep->total_cycles); + if (rep->data_type) { + struct symbol *sym = al.sym; + struct annotation *notes = sym ? symbol__annotation(sym) : NULL; + + /* XXX: Save annotate options here */ + if (notes) + notes->options = &rep->annotation_opts; + } } ret = hist_entry_iter__add(&iter, &al, rep->max_stack, rep); @@ -1600,6 +1610,9 @@ int cmd_report(int argc, const char **argv) sort_order = NULL; } + if (sort_order && strstr(sort_order, "type")) + report.data_type = true; + if (strcmp(input_name, "-") != 0) setup_browser(true); else @@ -1658,7 +1671,7 @@ int cmd_report(int argc, const char **argv) * so don't allocate extra space that won't be used in the stdio * implementation. */ - if (ui__has_annotation() || report.symbol_ipc || + if (ui__has_annotation() || report.symbol_ipc || report.data_type || report.total_cycles_mode) { ret = symbol__annotation_init(); if (ret < 0) From patchwork Thu Oct 12 03:50:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418225 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2EF03EBC; Thu, 12 Oct 2023 03:51:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="krSfokDD" Received: from mail-pl1-x629.google.com (mail-pl1-x629.google.com [IPv6:2607:f8b0:4864:20::629]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B78D3185; Wed, 11 Oct 2023 20:51:39 -0700 (PDT) Received: by mail-pl1-x629.google.com with SMTP id d9443c01a7336-1c871a095ceso4550015ad.2; Wed, 11 Oct 2023 20:51:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082697; 
x=1697687497; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=l/1w/NdKs766XcCtqpVuOkRbnXHH9DRAZo4m/CjTbtY=; b=krSfokDDOcXRmjVI49Pbu9TJA2twsnhOVjSCVYbBQ8Ipanev75JKldUyeiYK+qS4ch ECg2X2LhaEy3G0jiJNIEcY94brRUryhQ8NoZu/2thEVSh3/zv3hTyRlB2/aWkdfKcDFC dZRHDUAgYA8rNjiuh3/pA2HKHH1TR1guW/p9r/QtpC0rsV2wKMhtyTF4QIcg/bgoIh0V M55pJNfGLA+71jCejUVctsBhQLFaO7RojvES9EiechX9cr82GBvjQUxcR9Hc89aZgvt1 xxjmPjUdqYYQ++rF3OlbmIjPJcRW4uQa9WZA18R+mQcWGgAL7Xf4vX3KUh8oeyK3hk5u ytIg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082697; x=1697687497; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=l/1w/NdKs766XcCtqpVuOkRbnXHH9DRAZo4m/CjTbtY=; b=o4Wbtb9DVLa0fxVVxosoBAiT8lKKaw26LmnP4rdKoMfQFM6a4SgLzMOtPOIvF1Hoya Rt5mCA+BGWxFxjEloFUU3PPbBMsBX3wtRZeeygngKgls3ZUzAyyEzZ5xONAtlrF7S0Ag Soo40h+yfHmttYjHcfgvAJmifRF+I5ekhvHduy+aGLQ/SWGle5PdFrZwWWWlpDHrXafs LNLngeWRM+Wt5fCnjGaZr5vPGVyAGgMINqqRxXTjes/hE3HkAaxti8pK0HD5XWw6ZhTJ zcZwRM0QFRIZekazXLfDwqYoWC18oFaUFepR6AiaR+6gb3sDRgGY3KGM9O/vVNFcImfe LafQ== X-Gm-Message-State: AOJu0Yxkbzy6Z4pRMrW20q1p6iVaUEVQWNJmwM7GycFZIWx5YZMCDtXr 2kQq5SI8v7Sow6rfCaFYtO4= X-Google-Smtp-Source: AGHT+IF2d6T/3yhTEjm5f4jVYkGuQsMM+wvl5SxF88c1ATenxd3apUceICs/gkVGvC3XmEF3vC+oYA== X-Received: by 2002:a17:902:9a93:b0:1c8:8d9a:49d with SMTP id w19-20020a1709029a9300b001c88d9a049dmr13999009plp.47.1697082697296; Wed, 11 Oct 2023 20:51:37 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.51.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:51:36 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo 
Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 17/48] perf annotate-data: Add member field in the data type Date: Wed, 11 Oct 2023 20:50:40 -0700 Message-ID: <20231012035111.676789-18-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Add child member fields if the current type is a composite type such as a struct or union. The member fields are linked in the children list, and the same is done recursively if a child is itself a composite type. Add a 'self' member to annotated_data_type so that the type itself can be handled in the same way as its members.
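
For illustration only, here is a minimal standalone sketch of the member tree described above. It is not the code in this patch: struct member and the member_*() helpers are hypothetical names, a plain sibling pointer stands in for the kernel's struct list_head, and the DWARF walk is replaced by explicit calls. It shows the one idea that matters, namely that a child's offset is stored relative to the outermost type (local offset plus the parent's offset), so nested fields can be looked up by a flat byte offset.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct annotated_member (assumption). */
struct member {
	struct member *next;        /* next sibling in the parent's list */
	struct member *children;    /* first child, NULL for a leaf */
	struct member *last_child;  /* tail pointer, keeps insertion order */
	const char *var_name;       /* not owned by the node in this sketch */
	int offset;                 /* from the start of the outermost type */
	int size;
};

/* Append a child; local_off is relative to the parent, as in DWARF. */
static struct member *member_add(struct member *parent, const char *name,
				 int local_off, int size)
{
	struct member *m;

	assert(parent != NULL);
	m = calloc(1, sizeof(*m));
	if (m == NULL)
		return NULL;

	m->var_name = name;
	m->size = size;
	m->offset = parent->offset + local_off;  /* accumulate nesting */

	if (parent->last_child)
		parent->last_child->next = m;
	else
		parent->children = m;
	parent->last_child = m;
	return m;
}

/* Free every descendant of m, depth first; m itself is left alone. */
static void member_delete_children(struct member *m)
{
	struct member *child = m->children, *next;

	while (child) {
		next = child->next;
		member_delete_children(child);
		free(child);
		child = next;
	}
	m->children = NULL;
	m->last_child = NULL;
}

/* Return the innermost member covering offset, or m if none does. */
static const struct member *member_find(const struct member *m, int offset)
{
	const struct member *child;

	for (child = m->children; child; child = child->next) {
		if (offset >= child->offset &&
		    offset < child->offset + child->size)
			return member_find(child, offset);
	}
	return m;
}
```

In this sketch the first matching child wins, so overlapping union members resolve to whichever was added first.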
Signed-off-by: Namhyung Kim --- tools/perf/util/annotate-data.c | 106 ++++++++++++++++++++++++++++---- tools/perf/util/annotate-data.h | 27 ++++++-- tools/perf/util/sort.c | 4 +- 3 files changed, 119 insertions(+), 18 deletions(-) diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c index 3e3a561d73e3..63205506b9fe 100644 --- a/tools/perf/util/annotate-data.c +++ b/tools/perf/util/annotate-data.c @@ -19,7 +19,10 @@ /* Pseudo data types */ struct annotated_data_type unknown_type = { - .type_name = (char *)"(unknown)", + .self = { + .type_name = (char *)"(unknown)", + .children = LIST_HEAD_INIT(unknown_type.self.children), + }, }; /* @@ -35,9 +38,9 @@ static int data_type_cmp(const void *_key, const struct rb_node *node) type = rb_entry(node, struct annotated_data_type, node); - if (key->type_size != type->type_size) - return key->type_size - type->type_size; - return strcmp(key->type_name, type->type_name); + if (key->self.size != type->self.size) + return key->self.size - type->self.size; + return strcmp(key->self.type_name, type->self.type_name); } static bool data_type_less(struct rb_node *node_a, const struct rb_node *node_b) @@ -47,9 +50,80 @@ static bool data_type_less(struct rb_node *node_a, const struct rb_node *node_b) a = rb_entry(node_a, struct annotated_data_type, node); b = rb_entry(node_b, struct annotated_data_type, node); - if (a->type_size != b->type_size) - return a->type_size < b->type_size; - return strcmp(a->type_name, b->type_name) < 0; + if (a->self.size != b->self.size) + return a->self.size < b->self.size; + return strcmp(a->self.type_name, b->self.type_name) < 0; +} + +/* Recursively add new members for struct/union */ +static int __add_member_cb(Dwarf_Die *die, void *arg) +{ + struct annotated_member *parent = arg; + struct annotated_member *member; + Dwarf_Die member_type, die_mem; + Dwarf_Word size, loc; + Dwarf_Attribute attr; + struct strbuf sb; + int tag; + + if (dwarf_tag(die) != DW_TAG_member) + return 
DIE_FIND_CB_SIBLING; + + member = zalloc(sizeof(*member)); + if (member == NULL) + return DIE_FIND_CB_END; + + strbuf_init(&sb, 32); + die_get_typename(die, &sb); + + die_get_real_type(die, &member_type); + if (dwarf_aggregate_size(&member_type, &size) < 0) + size = 0; + + if (!dwarf_attr_integrate(die, DW_AT_data_member_location, &attr)) + loc = 0; + else + dwarf_formudata(&attr, &loc); + + member->type_name = strbuf_detach(&sb, NULL); + /* member->var_name can be NULL */ + if (dwarf_diename(die)) + member->var_name = strdup(dwarf_diename(die)); + member->size = size; + member->offset = loc + parent->offset; + INIT_LIST_HEAD(&member->children); + list_add_tail(&member->node, &parent->children); + + tag = dwarf_tag(&member_type); + switch (tag) { + case DW_TAG_structure_type: + case DW_TAG_union_type: + die_find_child(&member_type, __add_member_cb, member, &die_mem); + break; + default: + break; + } + return DIE_FIND_CB_SIBLING; +} + +static void add_member_types(struct annotated_data_type *parent, Dwarf_Die *type) +{ + Dwarf_Die die_mem; + + die_find_child(type, __add_member_cb, &parent->self, &die_mem); +} + +static void delete_members(struct annotated_member *member) +{ + struct annotated_member *child, *tmp; + + list_for_each_entry_safe(child, tmp, &member->children, node) { + list_del(&child->node); + delete_members(child); + free(child->type_name); + free(child->var_name); + free(child); + } } static struct annotated_data_type *dso__findnew_data_type(struct dso *dso, @@ -69,8 +143,8 @@ static struct annotated_data_type *dso__findnew_data_type(struct dso *dso, dwarf_aggregate_size(type_die, &size); /* Check existing nodes in dso->data_types tree */ - key.type_name = type_name; - key.type_size = size; + key.self.type_name = type_name; + key.self.size = size; node = rb_find(&key, &dso->data_types, data_type_cmp); if (node) { result = rb_entry(node, struct annotated_data_type, node); @@ -85,8 +159,15 @@ static struct annotated_data_type 
*dso__findnew_data_type(struct dso *dso, return NULL; } - result->type_name = type_name; - result->type_size = size; + result->self.type_name = type_name; + result->self.size = size; + INIT_LIST_HEAD(&result->self.children); + + /* + * Fill member info unconditionally for now, + * later perf annotate would need it. + */ + add_member_types(result, type_die); rb_add(&result->node, &dso->data_types, data_type_less); return result; @@ -237,7 +318,8 @@ void annotated_data_type__tree_delete(struct rb_root *root) rb_erase(node, root); pos = rb_entry(node, struct annotated_data_type, node); - free(pos->type_name); + delete_members(&pos->self); + free(pos->self.type_name); free(pos); } } diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h index 6efdd7e21b28..33748222e6aa 100644 --- a/tools/perf/util/annotate-data.h +++ b/tools/perf/util/annotate-data.h @@ -9,17 +9,36 @@ struct map_symbol; +/** + * struct annotated_member - Type of member field + * @node: List entry in the parent list + * @children: List head for child nodes + * @type_name: Name of the member type + * @var_name: Name of the member variable + * @offset: Offset from the outer data type + * @size: Size of the member field + * + * This represents a member type in a data type. + */ +struct annotated_member { + struct list_head node; + struct list_head children; + char *type_name; + char *var_name; + int offset; + int size; +}; + /** * struct annotated_data_type - Data type to profile - * @type_name: Name of the data type - * @type_size: Size of the data type + * @node: RB-tree node for dso->type_tree + * @self: Actual type information * * This represents a data type accessed by samples in the profile data. 
*/ struct annotated_data_type { struct rb_node node; - char *type_name; - int type_size; + struct annotated_member self; }; extern struct annotated_data_type unknown_type; diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c index c79564c1d5df..01300831333e 100644 --- a/tools/perf/util/sort.c +++ b/tools/perf/util/sort.c @@ -2166,7 +2166,7 @@ sort__type_collapse(struct hist_entry *left, struct hist_entry *right) right_type = right->mem_type; } - return strcmp(left_type->type_name, right_type->type_name); + return strcmp(left_type->self.type_name, right_type->self.type_name); } static int64_t @@ -2178,7 +2178,7 @@ sort__type_sort(struct hist_entry *left, struct hist_entry *right) static int hist_entry__type_snprintf(struct hist_entry *he, char *bf, size_t size, unsigned int width) { - return repsep_snprintf(bf, size, "%-*s", width, he->mem_type->type_name); + return repsep_snprintf(bf, size, "%-*s", width, he->mem_type->self.type_name); } struct sort_entry sort_type = { From patchwork Thu Oct 12 03:50:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418227 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BEC8FEC4; Thu, 12 Oct 2023 03:51:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="X/S3vEoI" Received: from mail-pl1-x636.google.com (mail-pl1-x636.google.com [IPv6:2607:f8b0:4864:20::636]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5FD9C18E; Wed, 11 Oct 2023 20:51:40 -0700 (PDT) Received: by mail-pl1-x636.google.com with SMTP id d9443c01a7336-1c9de3f66e5so3202605ad.3; Wed, 11 Oct 2023 20:51:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=gmail.com; s=20230601; t=1697082698; x=1697687498; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=DbDdqJ+JUJe9Qdp9iUp/VIlsxYDDvTeRSoEAhVAMMI4=; b=X/S3vEoIukE4rmSJkWbpPUv4LSUwRtBcUyAnRifc9iUppowhDzVwyzef7BsZ8WiNYP NB9Un1UEoC5xFPOxHp8p1EQ9gZxjbhbFdl5Ix2Ll/emaD8SDsLROPO/iPJJZ2soGA9km u4BGTWzM3m7eBh9dLSXvoR7cEI6SBSR3Grg7xGr3bSKRmfnHcj2wNh1MgEICBEx3TIPx DwxBunCiHS1+9RMyntyGRpSI4vrRew6YGVwzwiSOcRdlhGLqZHbvEOljnygIXvyC6mEo xWR/bqzWQuKRfuX5GCfIm2R8LTFhaIxkWn43qz3N3f1iC06j5RQM+O1izA15NhKT0bAo Y6bg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082698; x=1697687498; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=DbDdqJ+JUJe9Qdp9iUp/VIlsxYDDvTeRSoEAhVAMMI4=; b=maiw+rJKlwu9YeHXXaBI8eVBF6jK6dE0vagXfHMyuiJQJgfdb6kKqnP8I9EN8toL6z 4Pn5TdzCXNBMxq7bEH2yiyeUn3Qwb0HOtB+I+xPduUwSIM2E9fSSfpNQjsN9gsPJh0Sq GlFlmdvb/ISUSqepDFw2VDvDBI9z/Uwi4wMuEDFrOEgiOTLIVw+TwhYdiR9gVCjxJED7 ngiY7QAcNmHKYYn91XxxbRTHx8Wp1lJEV+iXSLeOgD7LJVxDg3oBXp4JBqJA8Rx1Tbko iYvSjBXuAlOm9Sv2NeHA7VIxa1LrALJ3rDrbw7Y4qmTkYibrAQS/X/h8VxANWPbOW/yI o8wg== X-Gm-Message-State: AOJu0YwSbTWg64x+qHPQJwyDd44/MZ6mgGOLfPoPQqr9HS3HUtQKe5jY mlxqxibCBK5h21vwPD3Zm+E= X-Google-Smtp-Source: AGHT+IGpi+TcAYZLFeKw3EN4+89W1pNtJto3tPl+MOGpTadU047nEO63AjGeDzUfKNzYYI3mbIYE6g== X-Received: by 2002:a17:902:e809:b0:1c0:6e92:8cc5 with SMTP id u9-20020a170902e80900b001c06e928cc5mr21221783plg.17.1697082698543; Wed, 11 Oct 2023 20:51:38 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.51.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:51:38 -0700 (PDT) Sender: Namhyung 
Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 18/48] perf annotate-data: Update sample histogram for type Date: Wed, 11 Oct 2023 20:50:41 -0700 Message-ID: <20231012035111.676789-19-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Add annotated_data_type__update_samples() to collect a histogram of data type accesses. It will be called by perf annotate to show which fields in the data type are accessed frequently.
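
For illustration only, a standalone sketch of the histogram layout described above: one histogram per event, each holding a total plus one entry per byte offset in the type. The *_sketch types and hist_*() helpers are hypothetical names, not the functions in this patch, and plain calloc() replaces perf's zalloc(). Two details are deliberate assumptions of the sketch rather than statements about the patch: the table pointer is reset to NULL on allocation failure, and both the offset and the event index are range checked before the update.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-offset entry, modelled on struct type_hist_entry. */
struct hist_entry_sketch {
	int nr_samples;
	uint64_t period;
};

/* Per-event histogram: totals plus one entry per byte of the type. */
struct type_hist_sketch {
	uint64_t nr_samples;
	uint64_t period;
	struct hist_entry_sketch addr[];  /* flexible array, size entries */
};

struct typed_sketch {
	int size;                         /* size of the type in bytes */
	int nr_histograms;                /* one histogram per event */
	struct type_hist_sketch **histograms;
};

static int hist_alloc(struct typed_sketch *t, int nr_events)
{
	size_t sz = sizeof(struct type_hist_sketch) +
		    sizeof(struct hist_entry_sketch) * (size_t)t->size;
	int i;

	if (nr_events <= 0)
		return -1;

	t->histograms = calloc((size_t)nr_events, sizeof(*t->histograms));
	if (t->histograms == NULL)
		return -1;

	for (i = 0; i < nr_events; i++) {
		t->histograms[i] = calloc(1, sz);
		if (t->histograms[i] == NULL)
			goto err;
	}
	t->nr_histograms = nr_events;
	return 0;

err:
	while (--i >= 0)
		free(t->histograms[i]);
	free(t->histograms);
	t->histograms = NULL;  /* avoid leaving a dangling pointer behind */
	return -1;
}

/* Add pre-aggregated samples for one event at one offset. */
static int hist_update(struct typed_sketch *t, int nr_events, int event_idx,
		       int offset, int nr_samples, uint64_t period)
{
	struct type_hist_sketch *h;

	assert(t != NULL);
	if (offset < 0 || offset >= t->size)
		return -1;
	if (t->histograms == NULL && hist_alloc(t, nr_events) < 0)
		return -1;
	if (event_idx < 0 || event_idx >= t->nr_histograms)
		return -1;

	h = t->histograms[event_idx];
	h->nr_samples += (uint64_t)nr_samples;
	h->period += period;
	h->addr[offset].nr_samples += nr_samples;
	h->addr[offset].period += period;
	return 0;
}

static void hist_free(struct typed_sketch *t)
{
	for (int i = 0; i < t->nr_histograms; i++)
		free(t->histograms[i]);
	free(t->histograms);
	t->histograms = NULL;
	t->nr_histograms = 0;
}
```

Allocating lazily on the first update, as sketched here, means types that are never sampled do not pay for a per-byte table.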
Signed-off-by: Namhyung Kim --- tools/perf/util/annotate-data.c | 81 +++++++++++++++++++++++++++++++++ tools/perf/util/annotate-data.h | 42 +++++++++++++++++ tools/perf/util/annotate.c | 9 +++- 3 files changed, 131 insertions(+), 1 deletion(-) diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c index 63205506b9fe..adeab45a3c63 100644 --- a/tools/perf/util/annotate-data.c +++ b/tools/perf/util/annotate-data.c @@ -12,6 +12,8 @@ #include "debuginfo.h" #include "debug.h" #include "dso.h" +#include "evsel.h" +#include "evlist.h" #include "map.h" #include "map_symbol.h" #include "strbuf.h" @@ -309,6 +311,44 @@ struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip, return result; } +static int alloc_data_type_histograms(struct annotated_data_type *adt, int nr_entries) +{ + int i; + size_t sz = sizeof(struct type_hist); + + sz += sizeof(struct type_hist_entry) * adt->self.size; + + /* Allocate a table of pointers for each event */ + adt->nr_histograms = nr_entries; + adt->histograms = calloc(nr_entries, sizeof(*adt->histograms)); + if (adt->histograms == NULL) + return -ENOMEM; + + /* + * Each histogram is allocated for the whole size of the type. + * TODO: Probably we can move the histogram to members. 
+ */ + for (i = 0; i < nr_entries; i++) { + adt->histograms[i] = zalloc(sz); + if (adt->histograms[i] == NULL) + goto err; + } + return 0; + +err: + while (--i >= 0) + free(adt->histograms[i]); + free(adt->histograms); + return -ENOMEM; +} + +static void delete_data_type_histograms(struct annotated_data_type *adt) +{ + for (int i = 0; i < adt->nr_histograms; i++) + free(adt->histograms[i]); + free(adt->histograms); +} + void annotated_data_type__tree_delete(struct rb_root *root) { struct annotated_data_type *pos; @@ -319,7 +359,48 @@ void annotated_data_type__tree_delete(struct rb_root *root) rb_erase(node, root); pos = rb_entry(node, struct annotated_data_type, node); delete_members(&pos->self); + delete_data_type_histograms(pos); free(pos->self.type_name); free(pos); } } + +/** + * annotated_data_type__update_samples - Update histogram + * @adt: Data type to update + * @evsel: Event to update + * @offset: Offset in the type + * @nr_samples: Number of samples at this offset + * @period: Event count at this offset + * + * This function updates type histogram at @offset for @evsel. Samples are + * aggregated before calling this function so it can be called with more + * than one sample at a certain offset.
+ */ +int annotated_data_type__update_samples(struct annotated_data_type *adt, + struct evsel *evsel, int offset, + int nr_samples, u64 period) +{ + struct type_hist *h; + + if (adt == NULL) + return 0; + + if (adt->histograms == NULL) { + int nr = evsel->evlist->core.nr_entries; + + if (alloc_data_type_histograms(adt, nr) < 0) + return -1; + } + + if (offset < 0 || offset >= adt->self.size) + return -1; + + h = adt->histograms[evsel->core.idx]; + + h->nr_samples += nr_samples; + h->addr[offset].nr_samples += nr_samples; + h->period += period; + h->addr[offset].period += period; + return 0; +} diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h index 33748222e6aa..d2dc025b1934 100644 --- a/tools/perf/util/annotate-data.h +++ b/tools/perf/util/annotate-data.h @@ -7,6 +7,7 @@ #include #include +struct evsel; struct map_symbol; /** @@ -29,16 +30,42 @@ struct annotated_member { int size; }; +/** + * struct type_hist_entry - Histogram entry per offset + * @nr_samples: Number of samples + * @period: Count of event + */ +struct type_hist_entry { + int nr_samples; + u64 period; +}; + +/** + * struct type_hist - Type histogram for each event + * @nr_samples: Total number of samples in this data type + * @period: Total count of the event in this data type + * @addr: Array of histogram entries + */ +struct type_hist { + u64 nr_samples; + u64 period; + struct type_hist_entry addr[]; +}; + /** * struct annotated_data_type - Data type to profile * @node: RB-tree node for dso->type_tree * @self: Actual type information + * @nr_histograms: Number of histogram entries + * @histograms: An array of pointers to histograms * * This represents a data type accessed by samples in the profile data.
*/ struct annotated_data_type { struct rb_node node; struct annotated_member self; + int nr_histograms; + struct type_hist **histograms; }; extern struct annotated_data_type unknown_type; @@ -49,6 +76,11 @@ extern struct annotated_data_type unknown_type; struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip, int reg, int offset); +/* Update type access histogram at the given offset */ +int annotated_data_type__update_samples(struct annotated_data_type *adt, + struct evsel *evsel, int offset, + int nr_samples, u64 period); + /* Release all data type information in the tree */ void annotated_data_type__tree_delete(struct rb_root *root); @@ -61,6 +93,16 @@ find_data_type(struct map_symbol *ms __maybe_unused, u64 ip __maybe_unused, return NULL; } +static inline int +annotated_data_type__update_samples(struct annotated_data_type *adt __maybe_unused, + struct evsel *evsel __maybe_unused, + int offset __maybe_unused, + int nr_samples __maybe_unused, + u64 period __maybe_unused) +{ + return -1; +} + static inline void annotated_data_type__tree_delete(struct rb_root *root __maybe_unused) { } diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index e5dc3d6fc6d0..ab942331720d 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -3645,6 +3645,7 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he) struct disasm_line *dl; struct annotated_insn_loc loc; struct annotated_op_loc *op_loc; + struct annotated_data_type *mem_type; u64 ip = he->ip; int i; @@ -3672,7 +3673,13 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he) if (!op_loc->mem_ref) continue; - return find_data_type(ms, ip, op_loc->reg, op_loc->offset); + mem_type = find_data_type(ms, ip, op_loc->reg, op_loc->offset); + + annotated_data_type__update_samples(mem_type, evsel, + op_loc->offset, + he->stat.nr_events, + he->stat.period); + return mem_type; } return NULL; } From patchwork Thu Oct 12 03:50:42 2023 
Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418226 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 46A33EC2; Thu, 12 Oct 2023 03:51:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="UFaw8n6R" Received: from mail-pl1-x62a.google.com (mail-pl1-x62a.google.com [IPv6:2607:f8b0:4864:20::62a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 61D59192; Wed, 11 Oct 2023 20:51:41 -0700 (PDT) Received: by mail-pl1-x62a.google.com with SMTP id d9443c01a7336-1c5bf7871dcso4611945ad.1; Wed, 11 Oct 2023 20:51:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082700; x=1697687500; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=GHmw97A1IWYhjKuWWW4L0EcNMYU+9RoJiaUerA0LE9w=; b=UFaw8n6RJO7CjSWokzA1IMHysCI1LIwdyiSzfr0VC1jQ+aQHFy5RXyFL/V9mKswVjA 29pcWaQKmbx436DJSuSLlgcngBKWuEGR8nLiOo4u5hkX5PcgO6VhH48npqK5gVS0YiQI SsJFug+vucbzzgis41RZElnNdagFLqzsO6qieMaU/aA3T66eZQuIXpn4JBcfQzYC3i6T o5c5lSb1jqCM453FpBXa6cokO6a8OSjVe1BIhBT3JdWuYovB2gZRBhlSsZRhRL9RjOQi 06J0z7IGk1QO8CIqACoVUXXZwrch05pINbN1BxwrxMpmSpl2y7TXjEafokim1WOOy5hk Y61A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082700; x=1697687500; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=GHmw97A1IWYhjKuWWW4L0EcNMYU+9RoJiaUerA0LE9w=; b=lprHItjvbqnrcaQ6rO3h7JFlB3iWdQXQ7DGG2WRpWJa0Fr1z0wz3+s3bSrwg/l7GZ5 
TJJ/g34sz7CdM18ygH2v+21uwCUxs6d8mFUMvh/PQpubJ7mJLo/4aVxCaTtHCQAinhlv eopeQQZ3fPy22JLpAXTmZaMKTK/Lwo9ihKabcIYyoJT/uK06eGxBQC2tLtg/pkLGwbQZ YmLtdR2YDgXj5ymzNgKjd8TaDGWP0W+eW3liP5NjTWJ3XXkU9mNPhM8HxoTir8I9d1tJ JQXcO9kllzGaBt6whr1kIoPo3S4p/RIvgVZhC+0YdCUiJUUh5hlel+6ZHE14CjXirIET LPZA== X-Gm-Message-State: AOJu0Yx6F5zudyNy3fVaN33GLx3M7boaFa69Mi6NuvPJsnqbOeiTT90x c9qKj45IyzjwbdeXDCc8xas= X-Google-Smtp-Source: AGHT+IFT7BrXbhSOfJkEnU1yNeX4XrRXF4IzRri7PbcVmijktKEh3jTa/vXSUxvnQYUWzY/AXq30wg== X-Received: by 2002:a17:902:ecc5:b0:1c6:e8d:29ea with SMTP id a5-20020a170902ecc500b001c60e8d29eamr24453456plh.60.1697082699795; Wed, 11 Oct 2023 20:51:39 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.51.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:51:39 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 19/48] perf report: Add 'typeoff' sort key Date: Wed, 11 Oct 2023 20:50:42 -0700 Message-ID: <20231012035111.676789-20-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: 
SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net

The typeoff sort key shows the data type name, offset and the name of
the field.  This is useful to see which field in the struct is accessed
most frequently.

  $ perf report -s type,typeoff --hierarchy --stdio
  ...
  #   Overhead  Data Type / Data Type Offset
  # ............  ............................
  #
  ...
      1.23%     struct cfs_rq
         0.19%     struct cfs_rq +404 (throttle_count)
         0.19%     struct cfs_rq +0 (load.weight)
         0.19%     struct cfs_rq +336 (leaf_cfs_rq_list.next)
         0.09%     struct cfs_rq +272 (propagate)
         0.09%     struct cfs_rq +196 (removed.nr)
         0.09%     struct cfs_rq +80 (curr)
         0.09%     struct cfs_rq +544 (lt_b_children_throttled)
         0.06%     struct cfs_rq +320 (rq)

Signed-off-by: Namhyung Kim
---
 tools/perf/Documentation/perf-report.txt |  1 +
 tools/perf/util/annotate.c               |  1 +
 tools/perf/util/hist.h                   |  1 +
 tools/perf/util/sort.c                   | 83 +++++++++++++++++++++++-
 tools/perf/util/sort.h                   |  2 +
 5 files changed, 87 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index aec34417090b..b57eb51b47aa 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -119,6 +119,7 @@ OPTIONS
 	  to the previous instruction in cycles. And currently supported only on X86
 	- simd: Flags describing a SIMD operation. "e" for empty Arm SVE predicate. "p" for partial Arm SVE predicate
 	- type: Data type of sample memory access.
+	- typeoff: Offset in the data type of sample memory access.

 	By default, comm, dso and symbol keys are used.
 	(i.e. --sort comm,dso,symbol)
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index ab942331720d..49d5b61e19e6 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -3679,6 +3679,7 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 						    op_loc->offset,
 						    he->stat.nr_events,
 						    he->stat.period);
+		he->mem_type_off = op_loc->offset;
 		return mem_type;
 	}
 	return NULL;
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 9bfed867f288..941176afcebc 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -83,6 +83,7 @@ enum hist_column {
 	HISTC_ADDR,
 	HISTC_SIMD,
 	HISTC_TYPE,
+	HISTC_TYPE_OFFSET,
 	HISTC_NR_COLS, /* Last entry */
 };

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 01300831333e..98eafef282df 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -2146,8 +2146,10 @@ static void sort__type_init(struct hist_entry *he)
 		return;

 	he->mem_type = hist_entry__get_data_type(he);
-	if (he->mem_type == NULL)
+	if (he->mem_type == NULL) {
 		he->mem_type = &unknown_type;
+		he->mem_type_off = 0;
+	}
 }

 static int64_t
@@ -2191,6 +2193,84 @@ struct sort_entry sort_type = {
 	.se_width_idx = HISTC_TYPE,
 };

+/* --sort typeoff */
+
+static int64_t
+sort__typeoff_sort(struct hist_entry *left, struct hist_entry *right)
+{
+	struct annotated_data_type *left_type = left->mem_type;
+	struct annotated_data_type *right_type = right->mem_type;
+	int64_t ret;
+
+	if (!left_type) {
+		sort__type_init(left);
+		left_type = left->mem_type;
+	}
+
+	if (!right_type) {
+		sort__type_init(right);
+		right_type = right->mem_type;
+	}
+
+	ret = strcmp(left_type->self.type_name, right_type->self.type_name);
+	if (ret)
+		return ret;
+	return left->mem_type_off - right->mem_type_off;
+}
+
+static void fill_member_name(char *buf, size_t sz, struct annotated_member *m,
+			     int offset, bool first)
+{
+	struct annotated_member *child;
+
+	if (list_empty(&m->children))
+		return;
+
+	list_for_each_entry(child, &m->children, node) {
+		if (child->offset <= offset && offset < child->offset + child->size) {
+			int len = 0;
+
+			/* It can have anonymous struct/union members */
+			if (child->var_name) {
+				len = scnprintf(buf, sz, "%s%s",
+						first ? "" : ".", child->var_name);
+				first = false;
+			}
+
+			fill_member_name(buf + len, sz - len, child, offset, first);
+			return;
+		}
+	}
+}
+
+static int hist_entry__typeoff_snprintf(struct hist_entry *he, char *bf,
+					size_t size, unsigned int width __maybe_unused)
+{
+	struct annotated_data_type *he_type = he->mem_type;
+	char buf[4096];
+
+	buf[0] = '\0';
+	if (list_empty(&he_type->self.children))
+		snprintf(buf, sizeof(buf), "no field");
+	else
+		fill_member_name(buf, sizeof(buf), &he_type->self,
+				 he->mem_type_off, true);
+	buf[4095] = '\0';
+
+	return repsep_snprintf(bf, size, "%s %+d (%s)", he_type->self.type_name,
+			       he->mem_type_off, buf);
+}
+
+struct sort_entry sort_type_offset = {
+	.se_header = "Data Type Offset",
+	.se_cmp = sort__type_cmp,
+	.se_collapse = sort__typeoff_sort,
+	.se_sort = sort__typeoff_sort,
+	.se_init = sort__type_init,
+	.se_snprintf = hist_entry__typeoff_snprintf,
+	.se_width_idx = HISTC_TYPE_OFFSET,
+};
+
 struct sort_dimension {
 	const char *name;
@@ -2247,6 +2327,7 @@ static struct sort_dimension common_sort_dimensions[] = {
 	DIM(SORT_GLOBAL_RETIRE_LAT, "retire_lat", sort_global_p_stage_cyc),
 	DIM(SORT_SIMD, "simd", sort_simd),
 	DIM(SORT_ANNOTATE_DATA_TYPE, "type", sort_type),
+	DIM(SORT_ANNOTATE_DATA_TYPE_OFFSET, "typeoff", sort_type_offset),
 };

 #undef DIM
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index aabf0b8331a3..d806adcc1e1e 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -113,6 +113,7 @@ struct hist_entry {
 	u64 p_stage_cyc;
 	u8 cpumode;
 	u8 depth;
+	int mem_type_off;
 	struct simd_flags simd_flags;

 	/* We are added by hists__add_dummy_entry.
*/ @@ -247,6 +248,7 @@ enum sort_type { SORT_GLOBAL_RETIRE_LAT, SORT_SIMD, SORT_ANNOTATE_DATA_TYPE, + SORT_ANNOTATE_DATA_TYPE_OFFSET, /* branch stack specific sort keys */ __SORT_BRANCH_STACK, From patchwork Thu Oct 12 03:50:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418228 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4C6DEEC2; Thu, 12 Oct 2023 03:51:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="mDxpf5rr" Received: from mail-pl1-x62b.google.com (mail-pl1-x62b.google.com [IPv6:2607:f8b0:4864:20::62b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F34B01A6; Wed, 11 Oct 2023 20:51:42 -0700 (PDT) Received: by mail-pl1-x62b.google.com with SMTP id d9443c01a7336-1c871a095ceso4550275ad.2; Wed, 11 Oct 2023 20:51:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082701; x=1697687501; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=A4uGboMJlrvaP3U3hMpW0VIY9Dmref5aHpzwpfgI3EY=; b=mDxpf5rrWxTOp3ebB9sCQ//Ebi4iqYxePZMUwfWV6w/8poKc3f5g0X2xWerDdAu/og lvawdPXgpUQi4sdOsKdzvL62GSgKCWVH2LPZAVgp7TOQIz17B6yFMoz/TwEJ4NlNzY2M v63vS0Vq/tDDHu1PChCq9W6zTTFITkGtli4B0rUWHI2ZGkfgAsTeG95Q7PFddidyGNuz 9V/nSU2DHVz5qkSPrSuBibhL8c+cznltToFnLK+s9bq+0spiiiiLa93WJFVoJJqyJ3Ts K72MR5h14Bp6GPQB8mfL7v5bnHMuRPGM+SPONm5kcJrUtCAOUIWMc/2z+1sLg/qhhQR3 WQBw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082701; x=1697687501; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=A4uGboMJlrvaP3U3hMpW0VIY9Dmref5aHpzwpfgI3EY=; b=Eudw4GN4u6O2HiH3/KA0Hf2lHNimHHMw12PtpBjI1l7+sI1DZGyjmxIAPPLdwiyj71 wcAEo6ZlKZi+1Hmb0SSHMZ19CMrxZ1/4A5SxUuDcUxmYETtyUBRXm2nJmRm8fni/A+HX /wwE4b2uXfNP8inaInBoiA5Uou2GMUCN3wt7mE8ZplTR3AcBjlRwWkcbDkW96g3f9lB5 RwZrGPfNHMQndxNQDMcY4gZ1GmAa0WR79U/8qw23Ued/vfd73kqhU/hUm9rKdhasr5zn 2zRv7hekoF5jetvPp7+WK43jZeXE29vcSfb7S4xs99HfcMQoQCyFMe+l9lrlcokvXXYe fJkg== X-Gm-Message-State: AOJu0YxDsl3ZuW+aV1ZrWiSVoxF5QFZ93OgrVvIkNELUA2bPXm9RbYkU JyiH2eQi9s7DsICG2cav0kc= X-Google-Smtp-Source: AGHT+IHXbNOmCQSjVGfizZp9XLZZ6ta0VI7LWSdJEJ2pe+anf6utVqesspw/pobJhAxnFvIlI4lpZQ== X-Received: by 2002:a17:903:234e:b0:1bc:5924:2da2 with SMTP id c14-20020a170903234e00b001bc59242da2mr22845946plh.56.1697082701083; Wed, 11 Oct 2023 20:51:41 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.51.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:51:40 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 20/48] perf report: Add 'symoff' sort key Date: Wed, 11 Oct 2023 20:50:43 -0700 Message-ID: <20231012035111.676789-21-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 

The symoff sort key prints the symbol and offset of each sample.  This
is useful for data type profiling because it shows the exact instruction
in the function that refers to the data.

  $ perf report -s type,sym,typeoff,symoff --hierarchy
  ...
  #       Overhead  Data Type / Symbol / Data Type Offset / Symbol Offset
  # ..............  .....................................................
  #
       1.23%        struct cfs_rq
          0.84%        update_blocked_averages
             0.19%        struct cfs_rq +336 (leaf_cfs_rq_list.next)
                0.19%        [k] update_blocked_averages+0x96
             0.19%        struct cfs_rq +0 (load.weight)
                0.14%        [k] update_blocked_averages+0x104
                0.04%        [k] update_blocked_averages+0x31c
             0.17%        struct cfs_rq +404 (throttle_count)
                0.12%        [k] update_blocked_averages+0x9d
                0.05%        [k] update_blocked_averages+0x1f9
             0.08%        struct cfs_rq +272 (propagate)
                0.07%        [k] update_blocked_averages+0x3d3
                0.02%        [k] update_blocked_averages+0x45b
  ...

Signed-off-by: Namhyung Kim
---
 tools/perf/Documentation/perf-report.txt |  1 +
 tools/perf/util/hist.h                   |  1 +
 tools/perf/util/sort.c                   | 47 ++++++++++++++++++++++++
 tools/perf/util/sort.h                   |  1 +
 4 files changed, 50 insertions(+)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index b57eb51b47aa..38f59ac064f7 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -120,6 +120,7 @@ OPTIONS
 	- simd: Flags describing a SIMD operation. "e" for empty Arm SVE predicate. "p" for partial Arm SVE predicate
 	- type: Data type of sample memory access.
 	- typeoff: Offset in the data type of sample memory access.
+	- symoff: Offset in the symbol.

 	By default, comm, dso and symbol keys are used.
 	(i.e. --sort comm,dso,symbol)
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 941176afcebc..1ce0ee262abe 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -84,6 +84,7 @@ enum hist_column {
 	HISTC_SIMD,
 	HISTC_TYPE,
 	HISTC_TYPE_OFFSET,
+	HISTC_SYMBOL_OFFSET,
 	HISTC_NR_COLS, /* Last entry */
 };

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 98eafef282df..e21bbd442637 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -419,6 +419,52 @@ struct sort_entry sort_sym = {
 	.se_width_idx = HISTC_SYMBOL,
 };

+/* --sort symoff */
+
+static int64_t
+sort__symoff_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+	int64_t ret;
+
+	ret = sort__sym_cmp(left, right);
+	if (ret)
+		return ret;
+
+	return left->ip - right->ip;
+}
+
+static int64_t
+sort__symoff_sort(struct hist_entry *left, struct hist_entry *right)
+{
+	int64_t ret;
+
+	ret = sort__sym_sort(left, right);
+	if (ret)
+		return ret;
+
+	return left->ip - right->ip;
+}
+
+static int
+hist_entry__symoff_snprintf(struct hist_entry *he, char *bf, size_t size, unsigned int width)
+{
+	struct symbol *sym = he->ms.sym;
+
+	if (sym == NULL)
+		return repsep_snprintf(bf, size, "[%c] %-#.*llx", he->level, width - 4, he->ip);
+
+	return repsep_snprintf(bf, size, "[%c] %s+0x%llx", he->level, sym->name, he->ip - sym->start);
+}
+
+struct sort_entry sort_sym_offset = {
+	.se_header = "Symbol Offset",
+	.se_cmp = sort__symoff_cmp,
+	.se_sort = sort__symoff_sort,
+	.se_snprintf = hist_entry__symoff_snprintf,
+	.se_filter = hist_entry__sym_filter,
+	.se_width_idx = HISTC_SYMBOL_OFFSET,
+};
+
 /* --sort srcline */

 char *hist_entry__srcline(struct hist_entry *he)
@@ -2328,6 +2374,7 @@ static struct sort_dimension common_sort_dimensions[] = {
 	DIM(SORT_SIMD, "simd", sort_simd),
 	DIM(SORT_ANNOTATE_DATA_TYPE, "type", sort_type),
 	DIM(SORT_ANNOTATE_DATA_TYPE_OFFSET, "typeoff", sort_type_offset),
+	DIM(SORT_SYM_OFFSET, "symoff", sort_sym_offset),
 };

 #undef DIM
diff --git
a/tools/perf/util/sort.h b/tools/perf/util/sort.h index d806adcc1e1e..6f6b4189a389 100644 --- a/tools/perf/util/sort.h +++ b/tools/perf/util/sort.h @@ -249,6 +249,7 @@ enum sort_type { SORT_SIMD, SORT_ANNOTATE_DATA_TYPE, SORT_ANNOTATE_DATA_TYPE_OFFSET, + SORT_SYM_OFFSET, /* branch stack specific sort keys */ __SORT_BRANCH_STACK, From patchwork Thu Oct 12 03:50:44 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418229 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CF88CEBD; Thu, 12 Oct 2023 03:51:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Yc5nS6vA" Received: from mail-pg1-x536.google.com (mail-pg1-x536.google.com [IPv6:2607:f8b0:4864:20::536]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7CECC1B2; Wed, 11 Oct 2023 20:51:43 -0700 (PDT) Received: by mail-pg1-x536.google.com with SMTP id 41be03b00d2f7-578e33b6fb7so380474a12.3; Wed, 11 Oct 2023 20:51:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082702; x=1697687502; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=EzzarU0Z4Z0jmC7pIDakAMpcrgTYGcG0+o/ody08WOk=; b=Yc5nS6vAHahHkopz8L1ivo7XRcLZ/xFlrTUyoHle+gqp+P+QHpc8rtxBygYGZMMwcH 6ojzt4qnu+8X/zfI+vP7lvIZixS1qUSAO/E/qessmisrSXBsusKRhiLqcdxf3r5u10ls nM22aevomtTNK5Q6+KvyB17Gcez6qfuGfHnahtwmDNvZ6Xez/2BqgqUg8StLxYuwe1J6 Fk2pHmDfRLV4WRRtkcg3Va68hQc/cVT+bsBQNrZkaVfQmVaWVlTlvwDRIfPdM/QUeKI2 WbdOMQxTgyDqiwaPsP03Eh/6J3/B3Xn8hbnuGNOVbmadmcMfojNlAI3UZ+FoGHcWYyVG PkEw== X-Google-DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082702; x=1697687502; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=EzzarU0Z4Z0jmC7pIDakAMpcrgTYGcG0+o/ody08WOk=; b=oJc7htKFivg5PHat6jD75jIhN7+7YDO0ulkNT5yq4bRH7/e/dLaSItVFmVdrFYIeYx NTD1EsJHfyqBN5VLBAfwyMD7Tc4jeITPuPTBgb8pmY+ny3qOvC/XabTH4ydkc4ZHE2mV /dyNmlnm7VW2pfqFml57xSbAfrc20a8OJmlbhXCyuTdPS6tG45bs5b3fugJaUxJUm2n8 OdpZWmMaX8tZoZIM9pfk0Q19eyRQok6Na2wX/yncVoSeuee0rWB1vw4GEr4w5cHvPqO1 1q59UqqVJvS7vps+8eTczobImVo3w3LKMocKIA6+NMznO/VCxZUscA5XVJr4GmPkvdYJ XJkw== X-Gm-Message-State: AOJu0Yx9GJJu97EfguRWI26TK9qbKt5ZLOaSTO5FP+kKmseT79FoT32F yG8VsxJa6Nti7vHDDNepFq8= X-Google-Smtp-Source: AGHT+IHx6Pj8xx72+mL0339BbM2hpIPuiXUgZ98E+U2RhBq0mFkTnrmoTIzG7//AggUp0LSYqNFnBQ== X-Received: by 2002:a17:903:124d:b0:1bf:d92e:c5a7 with SMTP id u13-20020a170903124d00b001bfd92ec5a7mr24609415plh.28.1697082702409; Wed, 11 Oct 2023 20:51:42 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.51.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:51:42 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 21/48] perf annotate: Add --data-type option Date: Wed, 11 Oct 2023 20:50:44 -0700 Message-ID: <20231012035111.676789-22-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: 
linux-trace-devel@vger.kernel.org

Support data type annotation with the new --data-type option.  It
internally uses the type sort key to collect a sample histogram for each
type and displays every member like below.

  $ perf annotate --data-type
  ...
  Annotate type: 'struct cfs_rq' in [kernel.kallsyms] (13 samples):
  ============================================================================
      samples     offset       size  field
           13          0        640  struct cfs_rq {
            2          0         16      struct load_weight load {
            2          0          8          unsigned long weight;
            0          8          4          u32 inv_weight;
                                         };
            0         16          8      unsigned long runnable_weight;
            0         24          4      unsigned int nr_running;
            1         28          4      unsigned int h_nr_running;
  ...

For simplicity it prints the number of samples per field for now, but
it should be easy to show the overhead percentage instead.

The number at the outer struct is the sum of the numbers of its inner
members.  For example, struct cfs_rq got 13 samples in total, of which
2 came from load (struct load_weight) and 1 from h_nr_running.
Similarly, struct load_weight got 2 samples in total and they all came
from the weight field.

I've added two new flags in symbol_conf for this.  The
annotate_data_member flag enables collecting the members of the type.
It is also needed for perf report with the typeoff sort key.  The
annotate_data_sample flag enables updating the sample stats for each
offset and is used only in annotate.

Currently it only supports the stdio output mode; TUI support can be
added later.
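
The per-member sample counts described above can be sketched outside of
perf as follows.  This is a minimal, self-contained illustration and not
the code in this patch: struct member, member_samples(), print_member()
and the demo_* tables are hypothetical stand-ins for perf's
annotated_member and type_hist, and the demo type is cut down to 32
bytes with made-up sample counts.  It assumes, as the patch does, that
samples are recorded per byte offset and that a member's count is the
sum over [offset, offset + size).

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for perf's struct annotated_member. */
struct member {
	const char *type_name;
	const char *var_name;		/* NULL for anonymous members */
	int offset;			/* byte offset from the start of the outer type */
	int size;			/* size in bytes */
	const struct member *children;	/* nested members, if any */
	size_t nr_children;
};

/* Sum the per-byte sample counts covered by [offset, offset + size). */
static int member_samples(const struct member *m, const int *hist)
{
	int i, samples = 0;

	for (i = 0; i < m->size; i++)
		samples += hist[m->offset + i];
	return samples;
}

/* Print one member and, recursively, its children. */
static void print_member(const struct member *m, const int *hist, int indent)
{
	size_t i;

	printf(" %10d %10d %10d  %*s%s\t%s", member_samples(m, hist),
	       m->offset, m->size, indent, "", m->type_name,
	       m->var_name ? m->var_name : "");
	if (m->nr_children)
		printf(" {\n");
	for (i = 0; i < m->nr_children; i++)
		print_member(&m->children[i], hist, indent + 4);
	if (m->nr_children)
		printf("%*s}", 35 + indent, "");
	printf(";\n");
}

/* Made-up layout: a 32-byte struct shaped like the start of cfs_rq. */
static const struct member demo_load_children[] = {
	{ "unsigned long", "weight", 0, 8, NULL, 0 },
	{ "u32", "inv_weight", 8, 4, NULL, 0 },
};

static const struct member demo_top_children[] = {
	{ "struct load_weight", "load", 0, 16, demo_load_children, 2 },
	{ "unsigned long", "runnable_weight", 16, 8, NULL, 0 },
	{ "unsigned int", "nr_running", 24, 4, NULL, 0 },
	{ "unsigned int", "h_nr_running", 28, 4, NULL, 0 },
};

static const struct member demo_type = {
	"struct cfs_rq_demo", NULL, 0, 32, demo_top_children, 4
};

/* Made-up samples: two hits at offset 0 and one at offset 28. */
static const int demo_hist[32] = { [0] = 2, [28] = 1 };
```

With these tables, member_samples() yields 3 for the outer struct, 2 for
load and 1 for h_nr_running, so the outer count is the sum of the inner
ones.  Calling print_member(&demo_type, demo_hist, 0) prints the nested
layout in the same column order as the output above.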

Signed-off-by: Namhyung Kim
---
 tools/perf/builtin-annotate.c   | 64 ++++++++++++++++++++++++++++++++-
 tools/perf/util/annotate-data.c |  8 ++---
 tools/perf/util/annotate.c      | 10 +++---
 tools/perf/util/sort.c          |  2 ++
 tools/perf/util/symbol_conf.h   |  4 ++-
 5 files changed, 77 insertions(+), 11 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index aeeb801f1ed7..6be15a37d2b7 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -20,6 +20,7 @@
 #include "util/evlist.h"
 #include "util/evsel.h"
 #include "util/annotate.h"
+#include "util/annotate-data.h"
 #include "util/event.h"
 #include
 #include "util/parse-events.h"
@@ -56,6 +57,7 @@ struct perf_annotate {
 	bool skip_missing;
 	bool has_br_stack;
 	bool group_set;
+	bool data_type;
 	float min_percent;
 	const char *sym_hist_filter;
 	const char *cpu_list;
@@ -231,8 +233,12 @@ static int evsel__add_sample(struct evsel *evsel, struct perf_sample *sample,
 {
 	struct hists *hists = evsel__hists(evsel);
 	struct hist_entry *he;
+	struct annotation *notes = al->sym ? symbol__annotation(al->sym) : NULL;
 	int ret;

+	if (notes)
+		notes->options = &ann->opts;
+
 	if ((!ann->has_br_stack || !has_annotation(ann)) &&
 	    ann->sym_hist_filter != NULL &&
 	    (al->sym == NULL ||
@@ -320,6 +326,32 @@ static int hist_entry__tty_annotate(struct hist_entry *he,
 	return symbol__tty_annotate2(&he->ms, evsel, &ann->opts);
 }

+static void print_annotated_data_type(struct annotated_data_type *mem_type,
+				      struct annotated_member *member,
+				      struct evsel *evsel, int indent)
+{
+	struct annotated_member *child;
+	struct type_hist *h = mem_type->histograms[evsel->core.idx];
+	int i, samples = 0;
+
+	for (i = 0; i < member->size; i++)
+		samples += h->addr[member->offset + i].nr_samples;
+
+	printf(" %10d %10d %10d %*s%s\t%s",
+	       samples, member->offset, member->size, indent, "", member->type_name,
+	       member->var_name ?: "");
+
+	if (!list_empty(&member->children))
+		printf(" {\n");
+
+	list_for_each_entry(child, &member->children, node)
+		print_annotated_data_type(mem_type, child, evsel, indent + 4);
+
+	if (!list_empty(&member->children))
+		printf("%*s}", 35 + indent, "");
+	printf(";\n");
+}
+
 static void hists__find_annotations(struct hists *hists,
 				    struct evsel *evsel,
 				    struct perf_annotate *ann)
@@ -359,6 +391,23 @@ static void hists__find_annotations(struct hists *hists,
 			continue;
 		}

+		if (ann->data_type) {
+			struct map *map = he->ms.map;
+
+			/* skip unknown type */
+			if (he->mem_type->histograms == NULL)
+				goto find_next;
+
+			printf("Annotate type: '%s' in %s (%d samples):\n",
+			       he->mem_type->self.type_name, map->dso->name, he->stat.nr_events);
+			printf("============================================================================\n");
+			printf(" %10s %10s %10s %s\n", "samples", "offset", "size", "field");
+
+			print_annotated_data_type(he->mem_type, &he->mem_type->self, evsel, 0);
+			printf("\n");
+			goto find_next;
+		}
+
 		if (use_browser == 2) {
 			int ret;
 			int (*annotate)(struct hist_entry *he,
@@ -606,6 +655,8 @@ int cmd_annotate(int argc, const char **argv)
 	OPT_CALLBACK_OPTARG(0, "itrace", &itrace_synth_opts, NULL, "opts",
 			    "Instruction Tracing options\n" ITRACE_HELP,
 			    itrace_parse_synth_opts),
+	OPT_BOOLEAN(0, "data-type", &annotate.data_type,
+		    "Show data type annotate for the memory accesses"),

 	OPT_END()
 	};
@@ -702,6 +753,14 @@ int cmd_annotate(int argc, const char **argv)
 		use_browser = 2;
 #endif

+	/* FIXME: only support stdio for now */
+	if (annotate.data_type) {
+		use_browser = 0;
+		annotate.opts.annotate_src = false;
+		symbol_conf.annotate_data_member = true;
+		symbol_conf.annotate_data_sample = true;
+	}
+
 	setup_browser(true);

 	/*
@@ -709,7 +768,10 @@ int cmd_annotate(int argc, const char **argv)
 	 * symbol, we do not care about the processes in annotate,
 	 * set sort order to avoid repeated output.
 	 */
-	sort_order = "dso,symbol";
+	if (annotate.data_type)
+		sort_order = "dso,type";
+	else
+		sort_order = "dso,symbol";

 	/*
 	 * Set SORT_MODE__BRANCH so that annotate display IPC/Cycle
diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index adeab45a3c63..ba7d35648b05 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -18,6 +18,7 @@
 #include "map_symbol.h"
 #include "strbuf.h"
 #include "symbol.h"
+#include "symbol_conf.h"

 /* Pseudo data types */
 struct annotated_data_type unknown_type = {
@@ -165,11 +166,8 @@ static struct annotated_data_type *dso__findnew_data_type(struct dso *dso,
 	result->self.size = size;
 	INIT_LIST_HEAD(&result->self.children);

-	/*
-	 * Fill member info unconditionally for now,
-	 * later perf annotate would need it.
-	 */
-	add_member_types(result, type_die);
+	if (symbol_conf.annotate_data_member)
+		add_member_types(result, type_die);

 	rb_add(&result->node, &dso->data_types, data_type_less);
 	return result;
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 49d5b61e19e6..3d9bb6b33e1a 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -3675,10 +3675,12 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)

 		mem_type = find_data_type(ms, ip, op_loc->reg, op_loc->offset);

-		annotated_data_type__update_samples(mem_type, evsel,
-						    op_loc->offset,
-						    he->stat.nr_events,
-						    he->stat.period);
+		if (symbol_conf.annotate_data_sample) {
+			annotated_data_type__update_samples(mem_type, evsel,
+							    op_loc->offset,
+							    he->stat.nr_events,
+							    he->stat.period);
+		}
 		he->mem_type_off = op_loc->offset;
 		return mem_type;
 	}
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index e21bbd442637..35eb589c03ec 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -3394,6 +3394,8 @@ int sort_dimension__add(struct perf_hpp_list *list, const char *tok,
 			list->thread = 1;
 		} else if (sd->entry == &sort_comm) {
 			list->comm = 1;
+		} else if (sd->entry == &sort_type_offset) {
+			symbol_conf.annotate_data_member = true;
 		}

 		return __sort_dimension__add(sd, list, level);
diff --git a/tools/perf/util/symbol_conf.h b/tools/perf/util/symbol_conf.h
index 0b589570d1d0..e6a1c48ca3bf 100644
--- a/tools/perf/util/symbol_conf.h
+++ b/tools/perf/util/symbol_conf.h
@@ -42,7 +42,9 @@ struct symbol_conf {
 			inline_name,
 			disable_add2line_warn,
 			buildid_mmap2,
-			guest_code;
+			guest_code,
+			annotate_data_member,
+			annotate_data_sample;
 	const char *vmlinux_name,
 			*kallsyms_name,
 			*source_prefix,

From patchwork Thu Oct 12 03:50:45 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Namhyung Kim
X-Patchwork-Id: 13418230
Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net
[23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 60E84EC2; Thu, 12 Oct 2023 03:52:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="SYBGsavk" Received: from mail-pl1-x632.google.com (mail-pl1-x632.google.com [IPv6:2607:f8b0:4864:20::632]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 012A31BC; Wed, 11 Oct 2023 20:51:45 -0700 (PDT) Received: by mail-pl1-x632.google.com with SMTP id d9443c01a7336-1c7373cff01so13468115ad.1; Wed, 11 Oct 2023 20:51:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082704; x=1697687504; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=N7uHYDAA9VeGr5WawxBTNd1cNrO0i4L2JxTGBE9KuJA=; b=SYBGsavkf2j8fYuXqZ52WfAZmmA6uPPSK5+o0zp/0KdREleQfJjFWcZ0rYXTvkNlrj O6m4i6imbqmqJA1ZecH6LLyYnCZt+yzs4JK2tXNR8RGNcYObo3rUNQiBXeplluKVP1RC A+mNxzOhEu1q1ZNtkUIXlgzWGRvZkSN6qOFbJ32bmHqtGGjDgeY37LGrB7RPaiInNCJ2 1BDXOfHm9dnGP90WnQ6MvkAqAq9WCwoPb+tmexZQ0pF7Mr8wc2vEup32QkPl5yD8XWVN /k0O2BpJA/AU5jSvk7zm492dpb3L463B4vG+QI20oQDOD7Cay+cdrvXXYIJdBxqsF4Vz lDDg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082704; x=1697687504; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=N7uHYDAA9VeGr5WawxBTNd1cNrO0i4L2JxTGBE9KuJA=; b=osKpf6jppDBBfhl19j+CufMfl5ll6NRztM18ebjI5VEnJNz9qpf7Bs65XDXr/io/KO uXbapO7XX1SVusDMZ3LtKa5hly5PJiY6y37seIL5l1r1elolM83kJTY8vq7WPmd4PPKE 4IdoPnkz0iv7zlGcEn/tdcejdp8ndyNBsjLvbz6O/f7w92gyaAmKs5OPvpSOA58KV5p8 Oo/mxLfCrRvs5KEE+587fh6cte4LAUNKOEo0LuExtrPgWmY+5Qazwt5WAuNSVLMXU8rr 
OaaHikyW26nNcdGwWZLpzHWaleUB6uTOnT/ky6jkaS8pO1xZIdvHbLkfmhpNwtaJBxem /CuQ==
X-Gm-Message-State: AOJu0YxFbpLy6WhyAqQ0Dc81mNeZfz/L2w1EwGix/S6Fu4xheYIi/+5K rVfBbyh9FB9w/dcEviW7UOl/c5BBx0Q=
X-Google-Smtp-Source: AGHT+IHeHSnw9TLEbJ0MI7Fq9W8cuetn3/gjN1gWOClQAo8ncAu6ZbymRdNheGi78nbxg0Q9OCX2zA==
X-Received: by 2002:a17:902:a387:b0:1c9:c32e:c9a0 with SMTP id x7-20020a170902a38700b001c9c32ec9a0mr7750338pla.2.1697082703674; Wed, 11 Oct 2023 20:51:43 -0700 (PDT)
Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.51.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:51:43 -0700 (PDT)
Sender: Namhyung Kim
From: Namhyung Kim
To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra
Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org
Subject: [PATCH 22/48] perf annotate: Add --type-stat option for debugging
Date: Wed, 11 Oct 2023 20:50:45 -0700
Message-ID: <20231012035111.676789-23-namhyung@kernel.org>
X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog
In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org>
References: <20231012035111.676789-1-namhyung@kernel.org>
Precedence: bulk
X-Mailing-List: linux-trace-devel@vger.kernel.org

The --type-stat option is meant to be used together with --data-type.
It prints detailed failure reasons for the data type annotation.

  $ perf annotate --data-type --type-stat
  Annotate data type stats:
  total 294, ok 116 (39.5%), bad 178 (60.5%)
  -----------------------------------------------------------
          30 : no_sym
          40 : no_insn_ops
          33 : no_mem_ops
          63 : no_var
           4 : no_typeinfo
           8 : bad_offset

Signed-off-by: Namhyung Kim
---
 tools/perf/builtin-annotate.c   | 44 ++++++++++++++++++++++++++++++++-
 tools/perf/util/annotate-data.c | 13 +++++++++-
 tools/perf/util/annotate-data.h | 31 +++++++++++++++++++++++
 tools/perf/util/annotate.c      | 20 ++++++++++++---
 4 files changed, 102 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 6be15a37d2b7..645acaba63f1 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -58,6 +58,7 @@ struct perf_annotate {
 	bool has_br_stack;
 	bool group_set;
 	bool data_type;
+	bool type_stat;
 	float min_percent;
 	const char *sym_hist_filter;
 	const char *cpu_list;
@@ -352,6 +353,43 @@ static void print_annotated_data_type(struct annotated_data_type *mem_type,
 	printf(";\n");
 }

+static void print_annotate_data_stat(struct annotated_data_stat *s)
+{
+#define PRINT_STAT(fld) if (s->fld) printf("%10d : %s\n", s->fld, #fld)
+
+	int bad = s->no_sym +
+			s->no_insn +
+			s->no_insn_ops +
+			s->no_mem_ops +
+			s->no_reg +
+			s->no_dbginfo +
+			s->no_cuinfo +
+			s->no_var +
+			s->no_typeinfo +
+			s->invalid_size +
+			s->bad_offset;
+	int ok = s->total - bad;
+
+	printf("Annotate data type stats:\n");
+	printf("total %d, ok %d (%.1f%%), bad %d (%.1f%%)\n",
+		s->total, ok, 100.0 * ok / (s->total ?: 1), bad, 100.0 * bad / (s->total ?: 1));
+	printf("-----------------------------------------------------------\n");
+	PRINT_STAT(no_sym);
+	PRINT_STAT(no_insn);
+	PRINT_STAT(no_insn_ops);
+	PRINT_STAT(no_mem_ops);
+	PRINT_STAT(no_reg);
+	PRINT_STAT(no_dbginfo);
+	PRINT_STAT(no_cuinfo);
+	PRINT_STAT(no_var);
+	PRINT_STAT(no_typeinfo);
+	PRINT_STAT(invalid_size);
+	PRINT_STAT(bad_offset);
+	printf("\n");
+
+#undef PRINT_STAT
+}
+
 static void hists__find_annotations(struct hists *hists,
 				    struct evsel *evsel,
 				    struct perf_annotate *ann)
@@ -359,6 +397,9 @@ static void hists__find_annotations(struct hists *hists,
 	struct rb_node *nd = rb_first_cached(&hists->entries), *next;
 	int key = K_RIGHT;

+	if (ann->type_stat)
+		print_annotate_data_stat(&ann_data_stat);
+
 	while (nd) {
 		struct hist_entry *he = rb_entry(nd, struct hist_entry, rb_node);
 		struct annotation *notes;
@@ -657,7 +698,8 @@ int cmd_annotate(int argc, const char **argv)
 			    itrace_parse_synth_opts),
 	OPT_BOOLEAN(0, "data-type", &annotate.data_type,
 		    "Show data type annotate for the memory accesses"),
-
+	OPT_BOOLEAN(0, "type-stat", &annotate.type_stat,
+		    "Show stats for the data type annotation"),
 	OPT_END()
 	};
 	int ret;
diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index ba7d35648b05..3e30e6855ba8 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -28,6 +28,9 @@ struct annotated_data_type unknown_type = {
 	},
 };

+/* Data type collection debug statistics */
+struct annotated_data_stat ann_data_stat;
+
 /*
  * Compare type name and size to maintain them in a tree.
  * I'm not sure if DWARF would have information of a single type in many
@@ -206,6 +209,7 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset)
 	/* Get the type of the variable */
 	if (die_get_real_type(var_die, type_die) == NULL) {
 		pr_debug("variable has no type\n");
+		ann_data_stat.no_typeinfo++;
 		return -1;
 	}

@@ -216,18 +220,21 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset)
 	if (dwarf_tag(type_die) != DW_TAG_pointer_type ||
 	    die_get_real_type(type_die, type_die) == NULL) {
 		pr_debug("no pointer or no type\n");
+		ann_data_stat.no_typeinfo++;
 		return -1;
 	}

 	/* Get the size of the actual type */
 	if (dwarf_aggregate_size(type_die, &size) < 0) {
 		pr_debug("type size is unknown\n");
+		ann_data_stat.invalid_size++;
 		return -1;
 	}

 	/* Minimal sanity check */
 	if ((unsigned)offset >= size) {
 		pr_debug("offset: %d is bigger than size: %lu\n", offset, size);
+		ann_data_stat.bad_offset++;
 		return -1;
 	}

@@ -246,6 +253,7 @@ static int find_data_type_die(struct debuginfo *di, u64 pc,
 	/* Get a compile_unit for this address */
 	if (!find_cu_die(di, pc, &cu_die)) {
 		pr_debug("cannot find CU for address %lx\n", pc);
+		ann_data_stat.no_cuinfo++;
 		return -1;
 	}

@@ -260,9 +268,12 @@ static int find_data_type_die(struct debuginfo *di, u64 pc,

 		/* Found a variable, see if it's correct */
 		ret = check_variable(&var_die, type_die, offset);
-		break;
+		goto out;
 	}
+	if (ret < 0)
+		ann_data_stat.no_var++;

+out:
 	free(scopes);
 	return ret;
 }
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index d2dc025b1934..8e73096c01d1 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -70,6 +70,37 @@ struct annotated_data_type {

 extern struct annotated_data_type unknown_type;

+/**
+ * struct annotated_data_stat - Debug statistics
+ * @total: Total number of entry
+ * @no_sym: No symbol or map found
+ * @no_insn: Failed to get disasm line
+ * @no_insn_ops: The instruction has no operands
+ * @no_mem_ops: The instruction has no memory operands
+ * @no_reg: Failed to extract a register from the operand
+ * @no_dbginfo: The binary has no debug information
+ * @no_cuinfo: Failed to find a compile_unit
+ * @no_var: Failed to find a matching variable
+ * @no_typeinfo: Failed to get a type info for the variable
+ * @invalid_size: Failed to get a size info of the type
+ * @bad_offset: The access offset is out of the type
+ */
+struct annotated_data_stat {
+	int total;
+	int no_sym;
+	int no_insn;
+	int no_insn_ops;
+	int no_mem_ops;
+	int no_reg;
+	int no_dbginfo;
+	int no_cuinfo;
+	int no_var;
+	int no_typeinfo;
+	int invalid_size;
+	int bad_offset;
+};
+extern struct annotated_data_stat ann_data_stat;
+
 #ifdef HAVE_DWARF_SUPPORT

 /* Returns data type at the location (ip, reg, offset) */
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 3d9bb6b33e1a..72b867001e22 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -3649,11 +3649,17 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 	u64 ip = he->ip;
 	int i;

-	if (ms->map == NULL || ms->sym == NULL)
+	ann_data_stat.total++;
+
+	if (ms->map == NULL || ms->sym == NULL) {
+		ann_data_stat.no_sym++;
 		return NULL;
+	}

-	if (evsel__get_arch(evsel, &arch) < 0)
+	if (evsel__get_arch(evsel, &arch) < 0) {
+		ann_data_stat.no_insn++;
 		return NULL;
+	}

 	/* Make sure it runs objdump to get disasm of the function */
 	symbol__ensure_annotate(ms, evsel);
@@ -3663,11 +3669,15 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 	 * This is too slow...
 	 */
 	dl = find_disasm_line(ms->sym, ip);
-	if (dl == NULL)
+	if (dl == NULL) {
+		ann_data_stat.no_insn++;
 		return NULL;
+	}
-	if (annotate_get_insn_location(arch, dl, &loc) < 0)
+	if (annotate_get_insn_location(arch, dl, &loc) < 0) {
+		ann_data_stat.no_insn_ops++;
 		return NULL;
+	}
 	for_each_insn_op_loc(&loc, i, op_loc) {
 		if (!op_loc->mem_ref)
@@ -3684,5 +3694,7 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 		he->mem_type_off = op_loc->offset;
 		return mem_type;
 	}
+
+	ann_data_stat.no_mem_ops++;
 	return NULL;
 }

From patchwork Thu Oct 12 03:50:46 2023
X-Patchwork-Submitter: Namhyung Kim
X-Patchwork-Id: 13418231
From: Namhyung Kim
To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML,
 linux-perf-users@vger.kernel.org, Linus Torvalds, Stephane Eranian,
 Masami Hiramatsu, linux-toolchains@vger.kernel.org,
 linux-trace-devel@vger.kernel.org
Subject: [PATCH 23/48] perf annotate: Add --insn-stat option for debugging
Date: Wed, 11 Oct 2023 20:50:46 -0700
Message-ID:
 <20231012035111.676789-24-namhyung@kernel.org>
In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org>
References: <20231012035111.676789-1-namhyung@kernel.org>

This is for debugging purposes.  It would be useful to see per-instruction
success/failure stats.

  $ perf annotate --data-type --insn-stat
  Annotate Instruction stats
  total 264, ok 143 (54.2%), bad 121 (45.8%)

   Name      :  Good   Bad
  -----------------------------------------------------------
   movq      :    45    31
   movl      :    22    11
   popq      :     0    19
   cmpl      :    16     3
   addq      :     8     7
   cmpq      :    11     3
   cmpxchgl  :     3     7
   cmpxchgq  :     8     0
   incl      :     3     3
   movzbl    :     4     2
   incq      :     4     2
   decl      :     6     0
   ...

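For readers who want to reproduce a table of this shape outside of perf, here is a minimal standalone sketch. It is not the code from this patch: the patch keeps the entries on a list_head and orders them with an insertion sort, while this sketch swaps in a plain array and qsort(). The names insn_stat, cmp_total_desc, pct and print_insn_stats are hypothetical and exist only for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical flat record; the patch uses a list-based struct instead. */
struct insn_stat {
	const char *name;
	int good;
	int bad;
};

/* Order entries by total sample count (good + bad), largest first. */
static int cmp_total_desc(const void *a, const void *b)
{
	const struct insn_stat *x = a;
	const struct insn_stat *y = b;
	int tx = x->good + x->bad;
	int ty = y->good + y->bad;

	return (tx < ty) - (tx > ty);
}

/* Percentage of part in total; a zero total is treated as 1 to avoid dividing by zero. */
static double pct(int part, int total)
{
	return 100.0 * part / (total ? total : 1);
}

/* Sort the array in place, then print a summary line and one row per instruction. */
static void print_insn_stats(struct insn_stat *stats, size_t nr)
{
	int good = 0, bad = 0, total;
	size_t i;

	qsort(stats, nr, sizeof(*stats), cmp_total_desc);

	for (i = 0; i < nr; i++) {
		good += stats[i].good;
		bad += stats[i].bad;
	}
	total = good + bad;

	printf("total %d, ok %d (%.1f%%), bad %d (%.1f%%)\n\n",
	       total, good, pct(good, total), bad, pct(bad, total));
	printf(" %-10s: %5s %5s\n", "Name", "Good", "Bad");
	for (i = 0; i < nr; i++)
		printf(" %-10s: %5d %5d\n", stats[i].name,
		       stats[i].good, stats[i].bad);
}
```

Calling print_insn_stats() on an array of such records prints the summary line followed by the rows in descending order of good + bad. Because qsort() is not stable, entries with equal totals may come out in any order.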
Signed-off-by: Namhyung Kim
---
 tools/perf/builtin-annotate.c | 41 +++++++++++++++++++++++++++++++++++
 tools/perf/util/annotate.c    | 39 +++++++++++++++++++++++++++++++++
 tools/perf/util/annotate.h    |  8 +++++++
 3 files changed, 88 insertions(+)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 645acaba63f1..a01d5e162466 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -59,6 +59,7 @@ struct perf_annotate {
 	bool group_set;
 	bool data_type;
 	bool type_stat;
+	bool insn_stat;
 	float min_percent;
 	const char *sym_hist_filter;
 	const char *cpu_list;
@@ -390,6 +391,42 @@ static void print_annotate_data_stat(struct annotated_data_stat *s)
 #undef PRINT_STAT
 }
+static void print_annotate_item_stat(struct list_head *head, const char *title)
+{
+	struct annotated_item_stat *istat, *pos, *iter;
+	int total_good, total_bad, total;
+	int sum1, sum2;
+	LIST_HEAD(tmp);
+
+	/* sort the list by count */
+	list_splice_init(head, &tmp);
+	total_good = total_bad = 0;
+
+	list_for_each_entry_safe(istat, pos, &tmp, list) {
+		total_good += istat->good;
+		total_bad += istat->bad;
+		sum1 = istat->good + istat->bad;
+
+		list_for_each_entry(iter, head, list) {
+			sum2 = iter->good + iter->bad;
+			if (sum1 > sum2)
+				break;
+		}
+		list_move_tail(&istat->list, &iter->list);
+	}
+	total = total_good + total_bad;
+
+	printf("Annotate %s stats\n", title);
+	printf("total %d, ok %d (%.1f%%), bad %d (%.1f%%)\n\n", total,
+	       total_good, 100.0 * total_good / (total ?: 1),
+	       total_bad, 100.0 * total_bad / (total ?: 1));
+	printf(" %-10s: %5s %5s\n", "Name", "Good", "Bad");
+	printf("-----------------------------------------------------------\n");
+	list_for_each_entry(istat, head, list)
+		printf(" %-10s: %5d %5d\n", istat->name, istat->good, istat->bad);
+	printf("\n");
+}
+
 static void hists__find_annotations(struct hists *hists,
 				    struct evsel *evsel,
 				    struct perf_annotate *ann)
@@ -399,6 +436,8 @@ static void hists__find_annotations(struct hists *hists,
 	if (ann->type_stat)
 		print_annotate_data_stat(&ann_data_stat);
+	if (ann->insn_stat)
+		print_annotate_item_stat(&ann_insn_stat, "Instruction");
 	while (nd) {
 		struct hist_entry *he = rb_entry(nd, struct hist_entry, rb_node);
@@ -700,6 +739,8 @@ int cmd_annotate(int argc, const char **argv)
 		    "Show data type annotate for the memory accesses"),
 	OPT_BOOLEAN(0, "type-stat", &annotate.type_stat,
 		    "Show stats for the data type annotation"),
+	OPT_BOOLEAN(0, "insn-stat", &annotate.insn_stat,
+		    "Show instruction stats for the data type annotation"),
 	OPT_END()
 	};
 	int ret;
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 72b867001e22..3f3cc7ae751f 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -100,6 +100,8 @@ static struct ins_ops nop_ops;
 static struct ins_ops lock_ops;
 static struct ins_ops ret_ops;
+LIST_HEAD(ann_insn_stat);
+
 static int arch__grow_instructions(struct arch *arch)
 {
 	struct ins *new_instructions;
@@ -3628,6 +3630,30 @@ static struct disasm_line *find_disasm_line(struct symbol *sym, u64 ip)
 	return NULL;
 }
+static struct annotated_item_stat *annotate_data_stat(struct list_head *head,
+						      const char *name)
+{
+	struct annotated_item_stat *istat;
+
+	list_for_each_entry(istat, head, list) {
+		if (!strcmp(istat->name, name))
+			return istat;
+	}
+
+	istat = zalloc(sizeof(*istat));
+	if (istat == NULL)
+		return NULL;
+
+	istat->name = strdup(name);
+	if (istat->name == NULL) {
+		free(istat);
+		return NULL;
+	}
+
+	list_add_tail(&istat->list, head);
+	return istat;
+}
+
 /**
  * hist_entry__get_data_type - find data type for given hist entry
  * @he: hist entry
@@ -3646,6 +3672,7 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 	struct annotated_insn_loc loc;
 	struct annotated_op_loc *op_loc;
 	struct annotated_data_type *mem_type;
+	struct annotated_item_stat *istat;
 	u64 ip = he->ip;
 	int i;
@@ -3674,8 +3701,15 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 		return NULL;
 	}
+	istat = annotate_data_stat(&ann_insn_stat, dl->ins.name);
+	if (istat == NULL) {
+		ann_data_stat.no_insn++;
+		return NULL;
+	}
+
 	if (annotate_get_insn_location(arch, dl, &loc) < 0) {
 		ann_data_stat.no_insn_ops++;
+		istat->bad++;
 		return NULL;
 	}
@@ -3684,6 +3718,10 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 			continue;
 		mem_type = find_data_type(ms, ip, op_loc->reg, op_loc->offset);
+		if (mem_type)
+			istat->good++;
+		else
+			istat->bad++;
 		if (symbol_conf.annotate_data_sample) {
 			annotated_data_type__update_samples(mem_type, evsel,
@@ -3696,5 +3734,6 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 	}
 	ann_data_stat.no_mem_ops++;
+	istat->bad++;
 	return NULL;
 }
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 299b4a18e804..5bb831445cbe 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -477,4 +477,12 @@ int annotate_get_insn_location(struct arch *arch, struct disasm_line *dl,
 /* Returns a data type from the sample instruction (if any) */
 struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he);
+struct annotated_item_stat {
+	struct list_head list;
+	char *name;
+	int good;
+	int bad;
+};
+extern struct list_head ann_insn_stat;
+
 #endif /* __PERF_ANNOTATE_H */

From patchwork Thu Oct 12 03:50:47 2023
X-Patchwork-Submitter: Namhyung Kim
X-Patchwork-Id: 13418232
From: Namhyung Kim
To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML,
 linux-perf-users@vger.kernel.org, Linus Torvalds, Stephane Eranian,
 Masami Hiramatsu, linux-toolchains@vger.kernel.org,
 linux-trace-devel@vger.kernel.org
Subject: [PATCH 24/48] perf annotate-data: Parse 'lock' prefix from llvm-objdump
Date: Wed, 11 Oct 2023 20:50:47 -0700
Message-ID: <20231012035111.676789-25-namhyung@kernel.org>
In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org>
References: <20231012035111.676789-1-namhyung@kernel.org>

For performance reasons, I prefer llvm-objdump over GNU objdump.  But I
found that llvm-objdump puts the x86 lock prefix on a separate line, like
below.

  ffffffff81000695: f0                    lock
  ffffffff81000696: ff 83 54 0b 00 00     incl    2900(%rbx)

This should be parsed properly, but for now I just changed the code to
find the insn at the next offset.

This improves the statistics as it can process more instructions.

  Annotate data type stats:
  total 294, ok 144 (49.0%), bad 150 (51.0%)
  -----------------------------------------------------------
          30 : no_sym
          35 : no_mem_ops
          71 : no_var
           6 : no_typeinfo
           8 : bad_offset

Signed-off-by: Namhyung Kim
---
 tools/perf/util/annotate.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 3f3cc7ae751f..190489df0fb7 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -3624,8 +3624,17 @@ static struct disasm_line *find_disasm_line(struct symbol *sym, u64 ip)
 	notes = symbol__annotation(sym);
 	list_for_each_entry(dl, &notes->src->source, al.node) {
-		if (sym->start + dl->al.offset == ip)
+		if (sym->start + dl->al.offset == ip) {
+			/*
+			 * llvm-objdump places "lock" in a separate line and
+			 * in that case, we want to get the next line.
+			 */
+			if (!strcmp(dl->ins.name, "lock") && *dl->ops.raw == '\0') {
+				ip++;
+				continue;
+			}
 			return dl;
+		}
 	}
 	return NULL;
 }
@@ -3717,6 +3726,9 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 		if (!op_loc->mem_ref)
 			continue;
+		/* Recalculate IP since it can be changed due to LOCK prefix */
+		ip = ms->sym->start + dl->al.offset;
+
 		mem_type = find_data_type(ms, ip, op_loc->reg, op_loc->offset);
 		if (mem_type)
 			istat->good++;

From patchwork Thu Oct 12 03:50:48 2023
X-Patchwork-Submitter: Namhyung Kim
X-Patchwork-Id: 13418234
From: Namhyung Kim
To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML,
 linux-perf-users@vger.kernel.org, Linus Torvalds, Stephane Eranian,
 Masami Hiramatsu, linux-toolchains@vger.kernel.org,
 linux-trace-devel@vger.kernel.org
Subject: [PATCH 25/48] perf annotate-data: Handle macro fusion on x86
Date: Wed, 11 Oct 2023 20:50:48 -0700
Message-ID: <20231012035111.676789-26-namhyung@kernel.org>
In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org>
References: <20231012035111.676789-1-namhyung@kernel.org>

When a sample comes from a conditional branch without a memory operand,
it could be due to macro fusion with the previous instruction.  So it
needs to check the memory operand of the previous one.

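As a rough illustration of that fallback, here is a standalone sketch. It is deliberately simplified and is not the ins__is_fused() logic in perf: it treats only cmp/test followed by a conditional jump as a fusable pair and ignores the CPU model entirely. The struct insn type and the helpers is_cond_jump, may_fuse and resolve_sample_insn are hypothetical names used only for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, minimal view of one disassembled instruction. */
struct insn {
	const char *name;	/* mnemonic, e.g. "cmpl" or "jne" */
	bool has_mem_ref;	/* true if any operand accesses memory */
};

/* A conditional jump is any "j*" mnemonic except the unconditional "jmp". */
static bool is_cond_jump(const char *name)
{
	return name[0] == 'j' && strncmp(name, "jmp", 3) != 0;
}

/*
 * Simplified assumption: only cmp/test can fuse with a following
 * conditional jump.  "cmpxchg" also starts with "cmp" but is excluded.
 */
static bool may_fuse(const char *prev, const char *cur)
{
	if (!is_cond_jump(cur))
		return false;
	if (strncmp(prev, "cmpxchg", 7) == 0)
		return false;
	return strncmp(prev, "cmp", 3) == 0 || strncmp(prev, "test", 4) == 0;
}

/*
 * Pick the instruction whose operands should be used for the data type
 * lookup: the sampled one if it accesses memory, otherwise the previous
 * one when the pair may have been fused.
 */
static size_t resolve_sample_insn(const struct insn *insns, size_t idx)
{
	if (insns[idx].has_mem_ref)
		return idx;
	if (idx > 0 && may_fuse(insns[idx - 1].name, insns[idx].name))
		return idx - 1;
	return idx;
}
```

With a sequence such as movq, cmpl, jne, a sample attributed to the jne would be resolved to the preceding cmpl, whose memory operand can then be used for the type lookup.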
This improves the stats as shown below:

  Annotate data type stats:
  total 294, ok 147 (50.0%), bad 147 (50.0%)
  -----------------------------------------------------------
          30 : no_sym
          32 : no_mem_ops
          71 : no_var
           6 : no_typeinfo
           8 : bad_offset

Signed-off-by: Namhyung Kim
---
 tools/perf/util/annotate.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 190489df0fb7..b0893d8f2ae3 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -3710,6 +3710,7 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 		return NULL;
 	}
+retry:
 	istat = annotate_data_stat(&ann_insn_stat, dl->ins.name);
 	if (istat == NULL) {
 		ann_data_stat.no_insn++;
@@ -3726,7 +3727,7 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 		if (!op_loc->mem_ref)
 			continue;
-		/* Recalculate IP since it can be changed due to LOCK prefix */
+		/* Recalculate IP because of LOCK prefix or insn fusion */
 		ip = ms->sym->start + dl->al.offset;
 		mem_type = find_data_type(ms, ip, op_loc->reg, op_loc->offset);
@@ -3745,6 +3746,20 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 		return mem_type;
 	}
+	/*
+	 * Some instructions can be fused and the actual memory access came
+	 * from the previous instruction.
+	 */
+	if (dl->al.offset > 0) {
+		struct disasm_line *prev_dl;
+
+		prev_dl = list_prev_entry(dl, al.node);
+		if (ins__is_fused(arch, prev_dl->ins.name, dl->ins.name)) {
+			dl = prev_dl;
+			goto retry;
+		}
+	}
+
 	ann_data_stat.no_mem_ops++;
 	istat->bad++;
 	return NULL;

From patchwork Thu Oct 12 03:50:49 2023
X-Patchwork-Submitter: Namhyung Kim
X-Patchwork-Id: 13418233
From: Namhyung Kim
To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML,
 linux-perf-users@vger.kernel.org, Linus Torvalds, Stephane Eranian,
 Masami Hiramatsu, linux-toolchains@vger.kernel.org,
 linux-trace-devel@vger.kernel.org
Subject: [PATCH 26/48] perf annotate-data: Handle array style accesses
Date: Wed, 11 Oct 2023 20:50:49 -0700
Message-ID: <20231012035111.676789-27-namhyung@kernel.org>
In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org>
References: <20231012035111.676789-1-namhyung@kernel.org>

On x86, instructions for array accesses often look like below.

  mov  0x1234(%rax,%rbx,8), %rcx

Usually the first register holds the type information and the second one
holds the index, and the current code only looks up a variable for the
first register.  But it can also be the other way around, so it needs to
check the second register if the lookup for the first one fails.

The stats changed like this.

  Annotate data type stats:
  total 294, ok 148 (50.3%), bad 146 (49.7%)
  -----------------------------------------------------------
          30 : no_sym
          32 : no_mem_ops
          66 : no_var
          10 : no_typeinfo
           8 : bad_offset

Signed-off-by: Namhyung Kim
---
 tools/perf/util/annotate-data.c | 24 +++++++++++++-----
 tools/perf/util/annotate-data.h |  5 ++--
 tools/perf/util/annotate.c      | 43 ++++++++++++++++++++++++++-------
 tools/perf/util/annotate.h      |  8 ++++--
 4 files changed, 61 insertions(+), 19 deletions(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index 3e30e6855ba8..bf6d53705af3 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -8,6 +8,7 @@
 #include
 #include
+#include "annotate.h"
 #include "annotate-data.h"
 #include "debuginfo.h"
 #include "debug.h"
@@ -217,7 +218,8 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset)
 	 * It expects a pointer type for a memory access.
 	 * Convert to a real type it points to.
 	 */
-	if (dwarf_tag(type_die) != DW_TAG_pointer_type ||
+	if ((dwarf_tag(type_die) != DW_TAG_pointer_type &&
+	     dwarf_tag(type_die) != DW_TAG_array_type) ||
 	    die_get_real_type(type_die, type_die) == NULL) {
 		pr_debug("no pointer or no type\n");
 		ann_data_stat.no_typeinfo++;
@@ -243,10 +245,11 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset)
 /* The result will be saved in @type_die */
 static int find_data_type_die(struct debuginfo *di, u64 pc,
-			      int reg, int offset, Dwarf_Die *type_die)
+			      struct annotated_op_loc *loc, Dwarf_Die *type_die)
 {
 	Dwarf_Die cu_die, var_die;
 	Dwarf_Die *scopes = NULL;
+	int reg, offset;
 	int ret = -1;
 	int i, nr_scopes;
@@ -260,6 +263,10 @@ static int find_data_type_die(struct debuginfo *di, u64 pc,
 	/* Get a list of nested scopes - i.e. (inlined) functions and blocks. */
 	nr_scopes = die_get_scopes(&cu_die, pc, &scopes);
+	reg = loc->reg1;
+	offset = loc->offset;
+
+retry:
 	/* Search from the inner-most scope to the outer */
 	for (i = nr_scopes - 1; i >= 0; i--) {
 		/* Look up variables/parameters in this scope */
@@ -270,6 +277,12 @@ static int find_data_type_die(struct debuginfo *di, u64 pc,
 		ret = check_variable(&var_die, type_die, offset);
 		goto out;
 	}
+
+	if (loc->multi_regs && reg == loc->reg1 && loc->reg1 != loc->reg2) {
+		reg = loc->reg2;
+		goto retry;
+	}
+
 	if (ret < 0)
 		ann_data_stat.no_var++;
@@ -282,15 +295,14 @@ static int find_data_type_die(struct debuginfo *di, u64 pc,
  * find_data_type - Return a data type at the location
  * @ms: map and symbol at the location
  * @ip: instruction address of the memory access
- * @reg: register that holds the base address
- * @offset: offset from the base address
+ * @loc: instruction operand location
  *
  * This functions searches the debug information of the binary to get the data
  * type it accesses.  The exact location is expressed by (ip, reg, offset).
  * It return %NULL if not found.
  */
 struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
-					   int reg, int offset)
+					   struct annotated_op_loc *loc)
 {
 	struct annotated_data_type *result = NULL;
 	struct dso *dso = ms->map->dso;
@@ -310,7 +322,7 @@ struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
 	 * a file address for DWARF processing.
 	 */
 	pc = map__rip_2objdump(ms->map, ip);
-	if (find_data_type_die(di, pc, reg, offset, &type_die) < 0)
+	if (find_data_type_die(di, pc, loc, &type_die) < 0)
 		goto out;
 	result = dso__findnew_data_type(dso, &type_die);
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index 8e73096c01d1..65ddd839850f 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -7,6 +7,7 @@
 #include
 #include
+struct annotated_op_loc;
 struct evsel;
 struct map_symbol;
@@ -105,7 +106,7 @@ extern struct annotated_data_stat ann_data_stat;
 /* Returns data type at the location (ip, reg, offset) */
 struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
-					   int reg, int offset);
+					   struct annotated_op_loc *loc);
 /* Update type access histogram at the given offset */
 int annotated_data_type__update_samples(struct annotated_data_type *adt,
@@ -119,7 +120,7 @@ void annotated_data_type__tree_delete(struct rb_root *root);
 static inline struct annotated_data_type *
 find_data_type(struct map_symbol *ms __maybe_unused, u64 ip __maybe_unused,
-	       int reg __maybe_unused, int offset __maybe_unused)
+	       struct annotated_op_loc *loc __maybe_unused)
 {
 	return NULL;
 }
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index b0893d8f2ae3..ccd1200746dd 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -3527,8 +3527,22 @@ static int extract_reg_offset(struct arch *arch, const char *str,
 	if (regname == NULL)
 		return -1;
-	op_loc->reg = get_dwarf_regnum(regname, 0);
+	op_loc->reg1 = get_dwarf_regnum(regname, 0);
 	free(regname);
+
+	/* Get the second register */
+	if (op_loc->multi_regs) {
+		p = strchr(p + 1, arch->objdump.register_char);
+		if (p == NULL)
+			return -1;
+
+		regname = strdup(p);
+		if (regname == NULL)
+			return -1;
+
+		op_loc->reg2 = get_dwarf_regnum(regname, 0);
+		free(regname);
+	}
 	return 0;
 }
@@ -3541,14 +3555,20 @@ static int extract_reg_offset(struct arch *arch, const char *str,
  * Get detailed location info (register and offset) in the instruction.
  * It needs both source and target operand and whether it accesses a
  * memory location.  The offset field is meaningful only when the
- * corresponding mem flag is set.
+ * corresponding mem flag is set.  The reg2 field is meaningful only
+ * when multi_regs flag is set.
  *
  * Some examples on x86:
  *
- *   mov  (%rax), %rcx   # src_reg = rax, src_mem = 1, src_offset = 0
- *                       # dst_reg = rcx, dst_mem = 0
+ *   mov  (%rax), %rcx   # src_reg1 = rax, src_mem = 1, src_offset = 0
+ *                       # dst_reg1 = rcx, dst_mem = 0
  *
- *   mov  0x18, %r8      # src_reg = -1, dst_reg = r8
+ *   mov  0x18, %r8      # src_reg1 = -1, src_mem = 0
+ *                       # dst_reg1 = r8, dst_mem = 0
+ *
+ *   mov  %rsi, 8(%rbx,%rcx,4)  # src_reg1 = rsi, src_mem = 0, src_multi_regs = 0
+ *                              # dst_reg1 = rbx, dst_reg2 = rcx, dst_mem = 1
+ *                              # dst_multi_regs = 1, dst_offset = 8
  */
 int annotate_get_insn_location(struct arch *arch, struct disasm_line *dl,
 			       struct annotated_insn_loc *loc)
@@ -3569,24 +3589,29 @@ int annotate_get_insn_location(struct arch *arch, struct disasm_line *dl,
 	for_each_insn_op_loc(loc, i, op_loc) {
 		const char *insn_str = ops->source.raw;
+		bool multi_regs = ops->source.multi_regs;
-		if (i == INSN_OP_TARGET)
+		if (i == INSN_OP_TARGET) {
 			insn_str = ops->target.raw;
+			multi_regs = ops->target.multi_regs;
+		}
 		/* Invalidate the register by default */
-		op_loc->reg = -1;
+		op_loc->reg1 = -1;
+		op_loc->reg2 = -1;
 		if (insn_str == NULL)
 			continue;
 		if (strchr(insn_str, arch->objdump.memory_ref_char)) {
 			op_loc->mem_ref = true;
+			op_loc->multi_regs = multi_regs;
 			extract_reg_offset(arch, insn_str, op_loc);
 		} else {
 			char *s = strdup(insn_str);
 			if (s) {
-				op_loc->reg = get_dwarf_regnum(s, 0);
+				op_loc->reg1 = get_dwarf_regnum(s, 0);
 				free(s);
 			}
 		}
@@ -3730,7 +3755,7 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 		/* Recalculate IP because of LOCK prefix or insn fusion */
 		ip = ms->sym->start + dl->al.offset;
-		mem_type = find_data_type(ms, ip, op_loc->reg, op_loc->offset);
+		mem_type = find_data_type(ms, ip, op_loc);
 		if (mem_type)
 			istat->good++;
 		else
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 5bb831445cbe..18a81faeb44b 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -440,14 +440,18 @@ int annotate_check_args(struct annotation_options *args);
 /**
  * struct annotated_op_loc - Location info of instruction operand
- * @reg: Register in the operand
+ * @reg1: First register in the operand
+ * @reg2: Second register in the operand
  * @offset: Memory access offset in the operand
  * @mem_ref: Whether the operand accesses memory
+ * @multi_regs: Whether the second register is used
  */
 struct annotated_op_loc {
-	int reg;
+	int reg1;
+	int reg2;
 	int offset;
 	bool mem_ref;
+	bool multi_regs;
 };
 enum annotated_insn_ops {

From patchwork Thu Oct 12 03:50:50 2023
X-Patchwork-Submitter: Namhyung Kim
X-Patchwork-Id: 13418235
Received: by
mail-pl1-x631.google.com with SMTP id d9443c01a7336-1c737d61a00so4806775ad.3; Wed, 11 Oct 2023 20:51:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082710; x=1697687510; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=Jbp8DzEFzCS/HQQoHHK2CySAUhB+wIeKLpuPuedDb9Q=; b=eIshufyTBmJATgHDh2W+F/A3w8jg6pCBc7eQ/oqePTnTkdo1wB2LEhNKcO20a8uJNa rUIug7y6dADWv8ZajwWvtbqkUmAC93JcZYRu75j+YwYGYmoxtOkpXC0hW6MM4y4NmWKL MR8kvm8yyKc1Wbndousp/H34GKt4WbXVKXlfdHoI3yM3aKyq9bsppjpQOyHXVOqNUa4F MnoS/1/yc2E3OVOxSggYfsjqEMToeSHkGA5oeeJCzN7kqfhepEEF/V5S20IWYev4Dmqm K1kEFyqkP/iN0qMRJvVd7c8qh0/ZNKfzYLbs7FGEzmIT/bj/78gmGcI0GamBWGvNGN3D PoxQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082710; x=1697687510; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=Jbp8DzEFzCS/HQQoHHK2CySAUhB+wIeKLpuPuedDb9Q=; b=dYr4qCfslNoi+XVJgaFi6B9HW9RzhrlvDekrhO8VX6zBGQW/zTymC4L4Qlsn7RTzni 4jeKf6uO+hRK65HHwMomOsx44llDhXgdJk8bMTecYJdwOzrRFcpDaWkDBjZ3IeQtFWza K9cIMuq0viF3tf2IwyZXos4rlvOyVcxAwGEKazSOXzODKiAq4g19gVdi6nhuIBDv+6yL BOt/HejNCph4rYOrQ5lnzBcyYviF2mIw407x4kRLcrraX9JGBLR3mb0QIdWj1+lAeTpd Wqm1UwKzwFn26ma5p6ffeezCfdcg5Yu6Fv+zLVYibPB2qfJ8UY7jzUb/nmTNJabkfb6A Dplg== X-Gm-Message-State: AOJu0Yw2Y1t+Ajro2U1VPFmGoSn4OwI8E+KF7BObEKqWWhXKxxDpthrk bB4RaawGXIRwXlgJTgSLJLs= X-Google-Smtp-Source: AGHT+IEIMKTTxCxxaR8KKMtaMId7cZL9K6kxGCgzNuDeuV+kMlZJPgPyAQpBd60zXkOUZ47ArbzZ7A== X-Received: by 2002:a17:902:f546:b0:1c5:b622:6fcd with SMTP id h6-20020a170902f54600b001c5b6226fcdmr29003301plf.22.1697082710218; Wed, 11 Oct 2023 20:51:50 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id 
w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.51.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:51:49 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 27/48] perf annotate-data: Add stack operation pseudo type Date: Wed, 11 Oct 2023 20:50:50 -0700 Message-ID: <20231012035111.676789-28-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net A typical function prologue and epilogue include multiple stack operations to save and restore the current value of registers. On x86, it looks like below: push r15 push r14 push r13 push r12 ... pop r12 pop r13 pop r14 pop r15 ret As these all touches the stack memory region, chances are high that they appear in a memory profile data. But these are not used for any real purpose yet so it'd return no types. One of my profile type shows that non neglible portion of data came from the stack operations. It also seems GCC generates more stack operations than clang. 

  Annotate Instruction stats
  total 264, ok 169 (64.0%), bad 95 (36.0%)

    Name      :  Good   Bad
  -----------------------------------------------------------
    movq      :    49    27
    movl      :    24     9
    popq      :     0    19   <-- here
    cmpl      :    17     2
    addq      :    14     1
    cmpq      :    12     2
    cmpxchgl  :     3     7

Instead of treating them as unknown, let's create a separate pseudo type
to represent those stack operations.

Signed-off-by: Namhyung Kim
---
 tools/perf/util/annotate-data.c |  7 +++++++
 tools/perf/util/annotate-data.h |  1 +
 tools/perf/util/annotate.c      | 18 ++++++++++++++++++
 3 files changed, 26 insertions(+)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index bf6d53705af3..a4276106e8a8 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -29,6 +29,13 @@ struct annotated_data_type unknown_type = {
 	},
 };
 
+struct annotated_data_type stackop_type = {
+	.self = {
+		.type_name = (char *)"(stack operation)",
+		.children = LIST_HEAD_INIT(stackop_type.self.children),
+	},
+};
+
 /* Data type collection debug statistics */
 struct annotated_data_stat ann_data_stat;
 
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index 65ddd839850f..214c625e7bc9 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -70,6 +70,7 @@ struct annotated_data_type {
 };
 
 extern struct annotated_data_type unknown_type;
+extern struct annotated_data_type stackop_type;
 
 /**
  * struct annotated_data_stat - Debug statistics
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index ccd1200746dd..dbbd349e67fc 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -3688,6 +3688,18 @@ static struct annotated_item_stat *annotate_data_stat(struct list_head *head,
 	return istat;
 }
 
+static bool is_stack_operation(struct arch *arch, struct disasm_line *dl)
+{
+	if (arch__is(arch, "x86")) {
+		if (!strncmp(dl->ins.name, "push", 4) ||
+		    !strncmp(dl->ins.name, "pop", 3) ||
+		    !strncmp(dl->ins.name, "ret", 3))
+			return true;
+	}
+
+	return false;
+}
+
 /**
  * hist_entry__get_data_type - find data type for given hist entry
  * @he: hist entry
@@ -3748,6 +3760,12 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 		return NULL;
 	}
 
+	if (is_stack_operation(arch, dl)) {
+		istat->good++;
+		he->mem_type_off = 0;
+		return &stackop_type;
+	}
+
 	for_each_insn_op_loc(&loc, i, op_loc) {
 		if (!op_loc->mem_ref)
 			continue;

From patchwork Thu Oct 12 03:50:51 2023
X-Patchwork-Submitter: Namhyung Kim
X-Patchwork-Id: 13418236
From: Namhyung Kim
To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML,
 linux-perf-users@vger.kernel.org, Linus Torvalds, Stephane Eranian,
 Masami Hiramatsu, linux-toolchains@vger.kernel.org,
 linux-trace-devel@vger.kernel.org
Subject: [PATCH 28/48] perf dwarf-aux: Add die_find_variable_by_addr()
Date: Wed, 11 Oct 2023 20:50:51 -0700
Message-ID: <20231012035111.676789-29-namhyung@kernel.org>
In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org>
References: <20231012035111.676789-1-namhyung@kernel.org>

die_find_variable_by_addr() finds a variable in the given DIE using
the given (PC-relative) address.  Global variables have a location
expression with DW_OP_addr that holds an address, so it can simply be
compared with the given address.

  <1><143a7>: Abbrev Number: 2 (DW_TAG_variable)
      <143a8>   DW_AT_name        : loops_per_jiffy
      <143ac>   DW_AT_type        : <0x1cca>
      <143b0>   DW_AT_external    : 1
      <143b0>   DW_AT_decl_file   : 193
      <143b1>   DW_AT_decl_line   : 213
      <143b2>   DW_AT_location    : 9 byte block: 3 b0 46 41 82 ff ff ff ff
       (DW_OP_addr: ffffffff824146b0)

Note that the type-offset should be calculated from the base address of
the global variable.

Signed-off-by: Namhyung Kim
Acked-by: Masami Hiramatsu (Google)
---
 tools/perf/util/dwarf-aux.c | 80 +++++++++++++++++++++++++++++++++++++
 tools/perf/util/dwarf-aux.h | 14 +++++++
 2 files changed, 94 insertions(+)

diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c
index 5bb05c84d249..97d9ae56350e 100644
--- a/tools/perf/util/dwarf-aux.c
+++ b/tools/perf/util/dwarf-aux.c
@@ -1266,8 +1266,12 @@ int die_get_var_range(Dwarf_Die *sp_die, Dwarf_Die *vr_die, struct strbuf *buf)
 struct find_var_data {
 	/* Target instruction address */
 	Dwarf_Addr pc;
+	/* Target memory address (for global data) */
+	Dwarf_Addr addr;
 	/* Target register */
 	unsigned reg;
+	/* Access offset, set for global data */
+	int offset;
 };
 
 /* Max number of registers DW_OP_regN supports */
@@ -1328,6 +1332,82 @@ Dwarf_Die *die_find_variable_by_reg(Dwarf_Die *sc_die, Dwarf_Addr pc, int reg,
 	};
 	return die_find_child(sc_die, __die_find_var_reg_cb, &data, die_mem);
 }
+
+/* Only checks direct child DIEs in the given scope */
+static int __die_find_var_addr_cb(Dwarf_Die *die_mem, void *arg)
+{
+	struct find_var_data *data = arg;
+	int tag = dwarf_tag(die_mem);
+	ptrdiff_t off = 0;
+	Dwarf_Attribute attr;
+	Dwarf_Addr base, start, end;
+	Dwarf_Word size;
+	Dwarf_Die type_die;
+	Dwarf_Op *ops;
+	size_t nops;
+
+	if (tag != DW_TAG_variable)
+		return DIE_FIND_CB_SIBLING;
+
+	if (dwarf_attr(die_mem, DW_AT_location, &attr) == NULL)
+		return DIE_FIND_CB_SIBLING;
+
+	while ((off = dwarf_getlocations(&attr, off, &base, &start, &end, &ops, &nops)) > 0) {
+		if (ops->atom != DW_OP_addr)
+			continue;
+
+		if (data->addr < ops->number)
+			continue;
+
+		if (data->addr == ops->number) {
+			/* Update offset relative to the start of the variable */
+			data->offset = 0;
+			return DIE_FIND_CB_END;
+		}
+
+		if (die_get_real_type(die_mem, &type_die) == NULL)
+			continue;
+
+		if (dwarf_aggregate_size(&type_die, &size) < 0)
+			continue;
+
+		if (data->addr >= ops->number + size)
+			continue;
+
+		/* Update offset relative to the start of the variable */
+		data->offset = data->addr - ops->number;
+		return DIE_FIND_CB_END;
+	}
+	return DIE_FIND_CB_SIBLING;
+}
+
+/**
+ * die_find_variable_by_addr - Find variable located at given address
+ * @sc_die: a scope DIE
+ * @pc: the program address to find
+ * @addr: the data address to find
+ * @die_mem: a buffer to save the resulting DIE
+ * @offset: the offset in the resulting type
+ *
+ * Find the variable DIE located at the given address (in PC-relative mode).
+ * This is usually for global variables.
+ */
+Dwarf_Die *die_find_variable_by_addr(Dwarf_Die *sc_die, Dwarf_Addr pc,
+				     Dwarf_Addr addr, Dwarf_Die *die_mem,
+				     int *offset)
+{
+	struct find_var_data data = {
+		.pc = pc,
+		.addr = addr,
+	};
+	Dwarf_Die *result;
+
+	result = die_find_child(sc_die, __die_find_var_addr_cb, &data, die_mem);
+	if (result)
+		*offset = data.offset;
+	return result;
+}
+
 #endif
 
 /*
diff --git a/tools/perf/util/dwarf-aux.h b/tools/perf/util/dwarf-aux.h
index 574405c57d3b..742098e3ee7e 100644
--- a/tools/perf/util/dwarf-aux.h
+++ b/tools/perf/util/dwarf-aux.h
@@ -144,6 +144,11 @@ int die_get_var_range(Dwarf_Die *sp_die, Dwarf_Die *vr_die, struct strbuf *buf);
 Dwarf_Die *die_find_variable_by_reg(Dwarf_Die *sc_die, Dwarf_Addr pc, int reg,
 				    Dwarf_Die *die_mem);
 
+/* Find a (global) variable located in the 'addr' */
+Dwarf_Die *die_find_variable_by_addr(Dwarf_Die *sc_die, Dwarf_Addr pc,
+				     Dwarf_Addr addr, Dwarf_Die *die_mem,
+				     int *offset);
+
 #else /* HAVE_DWARF_GETLOCATIONS_SUPPORT */
 
 static inline int die_get_var_range(Dwarf_Die *sp_die __maybe_unused,
@@ -161,6 +166,15 @@ static inline Dwarf_Die *die_find_variable_by_reg(Dwarf_Die *sc_die __maybe_unus
 	return NULL;
 }
 
+static inline Dwarf_Die *die_find_variable_by_addr(Dwarf_Die *sc_die __maybe_unused,
+						   Dwarf_Addr pc __maybe_unused,
+						   Dwarf_Addr addr __maybe_unused,
+						   Dwarf_Die *die_mem __maybe_unused,
+						   int *offset __maybe_unused)
+{
+	return NULL;
+}
+
 #endif /* HAVE_DWARF_GETLOCATIONS_SUPPORT */
 
 #endif /* _DWARF_AUX_H */

From patchwork Thu Oct 12 03:50:52 2023
X-Patchwork-Submitter: Namhyung Kim
X-Patchwork-Id: 13418237
From: Namhyung Kim
To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML,
 linux-perf-users@vger.kernel.org, Linus Torvalds, Stephane Eranian,
 Masami Hiramatsu, linux-toolchains@vger.kernel.org,
 linux-trace-devel@vger.kernel.org
Subject: [PATCH 29/48] perf annotate-data: Handle PC-relative addressing
Date: Wed, 11 Oct 2023 20:50:52 -0700
Message-ID: <20231012035111.676789-30-namhyung@kernel.org>
In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org>
References: <20231012035111.676789-1-namhyung@kernel.org>

Extend find_data_type_die() to find the data type from a PC-relative
address using die_find_variable_by_addr().  Users need to pass the
address of the (global) variable.

The offset for the variable should be updated after finding the type
because the offset in the instruction is only used to calculate the
address of the variable.  So it is changed to pass a pointer to the
offset, renamed to 'poffset'.

First it searches variables in the CU DIE, as global variables are
likely to be defined at the file level.  Then it iterates over the
scope DIEs to find a local (static) variable.

Signed-off-by: Namhyung Kim
---
 tools/perf/util/annotate-data.c | 56 ++++++++++++++++++++++-----------
 1 file changed, 38 insertions(+), 18 deletions(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index a4276106e8a8..3d4bd5040782 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -13,6 +13,7 @@
 #include "debuginfo.h"
 #include "debug.h"
 #include "dso.h"
+#include "dwarf-regs.h"
 #include "evsel.h"
 #include "evlist.h"
 #include "map.h"
@@ -210,7 +211,8 @@ static bool find_cu_die(struct debuginfo *di, u64 pc, Dwarf_Die *cu_die)
 }
 
 /* The type info will be saved in @type_die */
-static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset)
+static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset,
+			  bool is_pointer)
 {
 	Dwarf_Word size;
 
@@ -222,15 +224,18 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset)
 	}
 
 	/*
-	 * It expects a pointer type for a memory access.
-	 * Convert to a real type it points to.
+	 * Usually it expects a pointer type for a memory access.
+	 * Convert to a real type it points to.  But global variables
+	 * are accessed directly without a pointer.
 	 */
-	if ((dwarf_tag(type_die) != DW_TAG_pointer_type &&
-	     dwarf_tag(type_die) != DW_TAG_array_type) ||
-	    die_get_real_type(type_die, type_die) == NULL) {
-		pr_debug("no pointer or no type\n");
-		ann_data_stat.no_typeinfo++;
-		return -1;
+	if (is_pointer) {
+		if ((dwarf_tag(type_die) != DW_TAG_pointer_type &&
+		     dwarf_tag(type_die) != DW_TAG_array_type) ||
+		    die_get_real_type(type_die, type_die) == NULL) {
+			pr_debug("no pointer or no type\n");
+			ann_data_stat.no_typeinfo++;
+			return -1;
+		}
 	}
 
 	/* Get the size of the actual type */
@@ -251,7 +256,7 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset)
 }
 
 /* The result will be saved in @type_die */
-static int find_data_type_die(struct debuginfo *di, u64 pc,
+static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
 			      struct annotated_op_loc *loc, Dwarf_Die *type_die)
 {
 	Dwarf_Die cu_die, var_die;
@@ -267,21 +272,36 @@ static int find_data_type_die(struct debuginfo *di, u64 pc,
 		return -1;
 	}
 
-	/* Get a list of nested scopes - i.e. (inlined) functions and blocks. */
-	nr_scopes = die_get_scopes(&cu_die, pc, &scopes);
-
 	reg = loc->reg1;
 	offset = loc->offset;
 
+	if (reg == DWARF_REG_PC &&
+	    die_find_variable_by_addr(&cu_die, pc, addr, &var_die, &offset)) {
+		ret = check_variable(&var_die, type_die, offset,
+				     /*is_pointer=*/false);
+		goto out;
+	}
+
+	/* Get a list of nested scopes - i.e. (inlined) functions and blocks. */
+	nr_scopes = die_get_scopes(&cu_die, pc, &scopes);
+
 retry:
 	/* Search from the inner-most scope to the outer */
 	for (i = nr_scopes - 1; i >= 0; i--) {
-		/* Look up variables/parameters in this scope */
-		if (!die_find_variable_by_reg(&scopes[i], pc, reg, &var_die))
-			continue;
+		if (reg == DWARF_REG_PC) {
+			if (!die_find_variable_by_addr(&scopes[i], pc, addr,
+						       &var_die, &offset))
+				continue;
+		} else {
+			/* Look up variables/parameters in this scope */
+			if (!die_find_variable_by_reg(&scopes[i], pc, reg,
+						      &var_die))
+				continue;
+		}
 
 		/* Found a variable, see if it's correct */
-		ret = check_variable(&var_die, type_die, offset);
+		ret = check_variable(&var_die, type_die, offset,
+				     reg != DWARF_REG_PC);
 		goto out;
 	}
 
@@ -329,7 +349,7 @@ struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
 	 * a file address for DWARF processing.
 	 */
 	pc = map__rip_2objdump(ms->map, ip);
-	if (find_data_type_die(di, pc, loc, &type_die) < 0)
+	if (find_data_type_die(di, pc, 0, loc, &type_die) < 0)
 		goto out;
 
 	result = dso__findnew_data_type(dso, &type_die);

From patchwork Thu Oct 12 03:50:53 2023
X-Patchwork-Submitter: Namhyung Kim
X-Patchwork-Id: 13418238
From: Namhyung Kim
To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML,
 linux-perf-users@vger.kernel.org, Linus Torvalds, Stephane Eranian,
 Masami Hiramatsu, linux-toolchains@vger.kernel.org,
 linux-trace-devel@vger.kernel.org
Subject: [PATCH 30/48] perf annotate-data: Support global variables
Date: Wed, 11 Oct 2023 20:50:53 -0700
Message-ID: <20231012035111.676789-31-namhyung@kernel.org>
In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org>
References: <20231012035111.676789-1-namhyung@kernel.org>

Global variables are accessed using PC-relative addressing, so they need
to be handled separately.  PC-relative addressing is detected by using
DWARF_REG_PC.  On x86, the %rip register would be used.

The address can be calculated using the ip and the offset in the
instruction.  But it should start from the next instruction, so add
calculate_pcrel_addr() to do it properly.

But global variables defined in a different file would only have a
declaration, which doesn't include a location list.  So it first tries
to get the type info using the address, and then looks up the variable
declarations by name.  The name of a global variable should be taken
from the symbol table.  The declaration would have the type info.  So
extend find_var_type() to take both an address and a name for global
variables.
The stats now look like: Annotate data type stats: total 294, ok 153 (52.0%), bad 141 (48.0%) ----------------------------------------------------------- 30 : no_sym 32 : no_mem_ops 61 : no_var 10 : no_typeinfo 8 : bad_offset Signed-off-by: Namhyung Kim --- tools/perf/util/annotate-data.c | 38 ++++++++++++++++------ tools/perf/util/annotate-data.h | 6 ++-- tools/perf/util/annotate.c | 57 +++++++++++++++++++++++++++++++-- tools/perf/util/annotate.h | 4 +++ 4 files changed, 92 insertions(+), 13 deletions(-) diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c index 3d4bd5040782..857e2fbe83f2 100644 --- a/tools/perf/util/annotate-data.c +++ b/tools/perf/util/annotate-data.c @@ -257,7 +257,8 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset, /* The result will be saved in @type_die */ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr, - struct annotated_op_loc *loc, Dwarf_Die *type_die) + const char *var_name, struct annotated_op_loc *loc, + Dwarf_Die *type_die) { Dwarf_Die cu_die, var_die; Dwarf_Die *scopes = NULL; @@ -275,11 +276,21 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr, reg = loc->reg1; offset = loc->offset; - if (reg == DWARF_REG_PC && - die_find_variable_by_addr(&cu_die, pc, addr, &var_die, &offset)) { - ret = check_variable(&var_die, type_die, offset, - /*is_pointer=*/false); - goto out; + if (reg == DWARF_REG_PC) { + if (die_find_variable_by_addr(&cu_die, pc, addr, &var_die, &offset)) { + ret = check_variable(&var_die, type_die, offset, + /*is_pointer=*/false); + loc->offset = offset; + goto out; + } + + if (var_name && die_find_variable_at(&cu_die, var_name, pc, + &var_die)) { + ret = check_variable(&var_die, type_die, 0, + /*is_pointer=*/false); + /* loc->offset will be updated by the caller */ + goto out; + } } /* Get a list of nested scopes - i.e. (inlined) functions and blocks.
*/ @@ -302,6 +313,7 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr, /* Found a variable, see if it's correct */ ret = check_variable(&var_die, type_die, offset, reg != DWARF_REG_PC); + loc->offset = offset; goto out; } @@ -323,13 +335,21 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr, * @ms: map and symbol at the location * @ip: instruction address of the memory access * @loc: instruction operand location + * @addr: data address of the memory access + * @var_name: global variable name * * This functions searches the debug information of the binary to get the data - * type it accesses. The exact location is expressed by (ip, reg, offset). + * type it accesses. The exact location is expressed by (@ip, reg, offset) + * for pointer variables or (@ip, @addr) for global variables. Note that global + * variables might update the @loc->offset after finding the start of the variable. + * If it cannot find a global variable by address, it tries to find a declaration + * of the variable using @var_name. In that case, @loc->offset won't be updated. + * * It return %NULL if not found. */ struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip, - struct annotated_op_loc *loc) + struct annotated_op_loc *loc, u64 addr, + const char *var_name) { struct annotated_data_type *result = NULL; struct dso *dso = ms->map->dso; @@ -349,7 +369,7 @@ struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip, * a file address for DWARF processing.
*/ pc = map__rip_2objdump(ms->map, ip); - if (find_data_type_die(di, pc, 0, loc, &type_die) < 0) + if (find_data_type_die(di, pc, addr, var_name, loc, &type_die) < 0) goto out; result = dso__findnew_data_type(dso, &type_die); diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h index 214c625e7bc9..1b0db8e8c40e 100644 --- a/tools/perf/util/annotate-data.h +++ b/tools/perf/util/annotate-data.h @@ -107,7 +107,8 @@ extern struct annotated_data_stat ann_data_stat; /* Returns data type at the location (ip, reg, offset) */ struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip, - struct annotated_op_loc *loc); + struct annotated_op_loc *loc, u64 addr, + const char *var_name); /* Update type access histogram at the given offset */ int annotated_data_type__update_samples(struct annotated_data_type *adt, @@ -121,7 +122,8 @@ void annotated_data_type__tree_delete(struct rb_root *root); static inline struct annotated_data_type * find_data_type(struct map_symbol *ms __maybe_unused, u64 ip __maybe_unused, - struct annotated_op_loc *loc __maybe_unused) + struct annotated_op_loc *loc __maybe_unused, + u64 addr __maybe_unused, const char *var_name __maybe_unused) { return NULL; } diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index dbbd349e67fc..fe0074bb98f0 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -37,6 +37,7 @@ #include "util/sharded_mutex.h" #include "arch/common.h" #include "namespaces.h" +#include "thread.h" #include #include #include @@ -3700,6 +3701,30 @@ static bool is_stack_operation(struct arch *arch, struct disasm_line *dl) return false; } +u64 annotate_calc_pcrel(struct map_symbol *ms, u64 ip, int offset, + struct disasm_line *dl) +{ + struct annotation *notes; + struct disasm_line *next; + u64 addr; + + notes = symbol__annotation(ms->sym); + /* + * PC-relative addressing starts from the next instruction address + * But the IP is for the current instruction. 
Since disasm_line + * doesn't have the instruction size, calculate it using the next + * disasm_line. If it's the last one, we can use symbol's end + * address directly. + */ + if (&dl->al.node == notes->src->source.prev) + addr = ms->sym->end + offset; + else { + next = list_next_entry(dl, al.node); + addr = ip + (next->al.offset - dl->al.offset) + offset; + } + return map__rip_2objdump(ms->map, addr); +} + /** * hist_entry__get_data_type - find data type for given hist entry * @he: hist entry @@ -3719,7 +3744,9 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he) struct annotated_op_loc *op_loc; struct annotated_data_type *mem_type; struct annotated_item_stat *istat; - u64 ip = he->ip; + u64 ip = he->ip, addr = 0; + const char *var_name = NULL; + int var_offset; int i; ann_data_stat.total++; @@ -3773,12 +3800,38 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he) /* Recalculate IP because of LOCK prefix or insn fusion */ ip = ms->sym->start + dl->al.offset; - mem_type = find_data_type(ms, ip, op_loc); + var_offset = op_loc->offset; + + /* PC-relative addressing */ + if (op_loc->reg1 == DWARF_REG_PC) { + struct addr_location al; + struct symbol *var; + u64 map_addr; + + addr = annotate_calc_pcrel(ms, ip, op_loc->offset, dl); + /* Kernel symbols might be relocated */ + map_addr = addr + map__reloc(ms->map); + + addr_location__init(&al); + var = thread__find_symbol_fb(he->thread, he->cpumode, + map_addr, &al); + if (var) { + var_name = var->name; + /* Calculate type offset from the start of variable */ + var_offset = map_addr - map__unmap_ip(al.map, var->start); + } + addr_location__exit(&al); + } + + mem_type = find_data_type(ms, ip, op_loc, addr, var_name); if (mem_type) istat->good++; else istat->bad++; + if (mem_type && var_name) + op_loc->offset = var_offset; + if (symbol_conf.annotate_data_sample) { annotated_data_type__update_samples(mem_type, evsel, op_loc->offset, diff --git a/tools/perf/util/annotate.h 
b/tools/perf/util/annotate.h index 18a81faeb44b..99c8d30a2fa7 100644 --- a/tools/perf/util/annotate.h +++ b/tools/perf/util/annotate.h @@ -489,4 +489,8 @@ struct annotated_item_stat { }; extern struct list_head ann_insn_stat; +/* Calculate PC-relative address */ +u64 annotate_calc_pcrel(struct map_symbol *ms, u64 ip, int offset, + struct disasm_line *dl); + #endif /* __PERF_ANNOTATE_H */ From patchwork Thu Oct 12 03:50:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418239 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0B6C61378; Thu, 12 Oct 2023 03:52:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Bt1cO1HK" Received: from mail-pl1-x62e.google.com (mail-pl1-x62e.google.com [IPv6:2607:f8b0:4864:20::62e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6F97A10E3; Wed, 11 Oct 2023 20:51:57 -0700 (PDT) Received: by mail-pl1-x62e.google.com with SMTP id d9443c01a7336-1c88b467ef8so4754815ad.0; Wed, 11 Oct 2023 20:51:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082715; x=1697687515; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=IpeT682jXo8zHQtou6ZhNNKzbowV2y0VimL5UVLWBn8=; b=Bt1cO1HK3G6coroGMGSCNYFwRd4qnfdCkfYk8pVezNvlkWhh0NHgFq/+qFLKFSa/q2 otz+wwWidkFSHfUj4CUGzJJMXvh/mWy+NZKDI+gzld25vKdKhZ2TkOsjqZljYl4qOo/v l7jwqmFbLweTcEgZ5Yxw3g7+PwFBY+ukFg9xO4dKAT2reVP4X1HKKb1g7/BqLpGqFM49 7NXcN1k2si/iuL2IrUwrb/1pmNEhUzTXFbu6XulCIA7j/NzByyvJbgmUi4IKUZL5PEnA 
z4zIiM1YzbZmlEE56cK7/4dAcjFeZ32IhbdBj+uvLanCTjAX8e7a16u5kkouhQv/+C9c 5Jzg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082715; x=1697687515; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=IpeT682jXo8zHQtou6ZhNNKzbowV2y0VimL5UVLWBn8=; b=Q6O3eVVPCyN3IehfXfvgMz+RORUSJWBnZ0oJbLwn04L1Tj1jahV5sjhrZZmSehUqv7 lRnko1M3aXjI1yDqBn+sJ9+Yw9lAA56cD7ocXwGPiEm1Q97nBJdrWYPD/Dl/2kY3A8vA o+EBAk3izt/HDGGEGQg9phPt8SZKHMqcfkSoAmXF30R4kizGY/p6e5v8IjiyEe7SBRP4 vwmukVyO5GQU/E9mU9GDuvs4wvY4xLyp5PADQTvD2tgjOu9g7B6qjcVe3qx0BnEJvX0J ndpfzDc1VNEQNer7z2bkBmkKyJPD8GfWfRAszcQO/YoFJqQ6F+10uKXx+GugJ3JNvMmy Ys5Q== X-Gm-Message-State: AOJu0YzEYa1s3i6IfzTWhfPwjgiv6A4gw75Dh0h2JBd5SqxgkyFK7kLa OdNP++AaczFVylFhM/ptinhmf3dhGoo= X-Google-Smtp-Source: AGHT+IGObjCpXIRGXvxG2XeisCJpHRn7826etev5dtkSoj7pISBW7/J+f91nhwI9aX8oEoEP+26kOg== X-Received: by 2002:a17:902:c947:b0:1c7:4a8a:32d1 with SMTP id i7-20020a170902c94700b001c74a8a32d1mr24944285pla.28.1697082715545; Wed, 11 Oct 2023 20:51:55 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.51.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:51:55 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 31/48] perf dwarf-aux: Add die_get_cfa() Date: Wed, 11 Oct 2023 20:50:54 -0700 Message-ID: <20231012035111.676789-32-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: 
<20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net The die_get_cfa() is to get frame base register and offset at the given instruction address (pc). This info will be used to locate stack variables which have location expression using DW_OP_fbreg. Signed-off-by: Namhyung Kim --- tools/perf/util/dwarf-aux.c | 64 +++++++++++++++++++++++++++++++++++++ tools/perf/util/dwarf-aux.h | 9 ++++++ 2 files changed, 73 insertions(+) diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c index 97d9ae56350e..796413eb4e8f 100644 --- a/tools/perf/util/dwarf-aux.c +++ b/tools/perf/util/dwarf-aux.c @@ -1408,6 +1408,70 @@ Dwarf_Die *die_find_variable_by_addr(Dwarf_Die *sc_die, Dwarf_Addr pc, return result; } +static int reg_from_dwarf_op(Dwarf_Op *op) +{ + switch (op->atom) { + case DW_OP_reg0 ... DW_OP_reg31: + return op->atom - DW_OP_reg0; + case DW_OP_breg0 ... DW_OP_breg31: + return op->atom - DW_OP_breg0; + case DW_OP_regx: + case DW_OP_bregx: + return op->number; + default: + break; + } + return -1; +} + +static int offset_from_dwarf_op(Dwarf_Op *op) +{ + switch (op->atom) { + case DW_OP_reg0 ... DW_OP_reg31: + case DW_OP_regx: + return 0; + case DW_OP_breg0 ... 
DW_OP_breg31: + return op->number; + case DW_OP_bregx: + return op->number2; + default: + break; + } + return -1; +} + +/** + * die_get_cfa - Get frame base information + * @dwarf: a Dwarf info + * @pc: program address + * @preg: pointer for saved register + * @poffset: pointer for saved offset + * + * This function gets register and offset for CFA (Canonical Frame Address) + * by searching the CIE/FDE info. The CFA usually points to the start address + * of the current stack frame and local variables can be located using an offset + * from the CFA. The @preg and @poffset will be updated if it returns 0. + */ +int die_get_cfa(Dwarf *dwarf, u64 pc, int *preg, int *poffset) +{ + Dwarf_CFI *cfi; + Dwarf_Frame *frame = NULL; + Dwarf_Op *ops = NULL; + size_t nops; + + cfi = dwarf_getcfi(dwarf); + if (cfi == NULL) + return -1; + + if (!dwarf_cfi_addrframe(cfi, pc, &frame) && + !dwarf_frame_cfa(frame, &ops, &nops) && nops == 1) { + *preg = reg_from_dwarf_op(ops); + *poffset = offset_from_dwarf_op(ops); + return 0; + } + return -1; +} + #endif /* diff --git a/tools/perf/util/dwarf-aux.h b/tools/perf/util/dwarf-aux.h index 742098e3ee7e..29a7243b1a45 100644 --- a/tools/perf/util/dwarf-aux.h +++ b/tools/perf/util/dwarf-aux.h @@ -149,6 +149,9 @@ Dwarf_Die *die_find_variable_by_addr(Dwarf_Die *sc_die, Dwarf_Addr pc, Dwarf_Addr addr, Dwarf_Die *die_mem, int *offset); +/* Get the frame base information from CFA */ +int die_get_cfa(Dwarf *dwarf, u64 pc, int *preg, int *poffset); + #else /* HAVE_DWARF_GETLOCATIONS_SUPPORT */ static inline int die_get_var_range(Dwarf_Die *sp_die __maybe_unused, @@ -175,6 +178,12 @@ static inline Dwarf_Die *die_find_variable_by_addr(Dwarf_Die *sc_die __maybe_unu return NULL; } +static inline int die_get_cfa(Dwarf *dwarf __maybe_unused, u64 pc __maybe_unused, + int *preg __maybe_unused, int *poffset __maybe_unused) +{ + return -1; +} + #endif /* HAVE_DWARF_GETLOCATIONS_SUPPORT */ #endif /* _DWARF_AUX_H */ From patchwork Thu Oct 12 03:50:55 2023 
Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418241 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E67871370; Thu, 12 Oct 2023 03:52:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="TCqxMtL/" Received: from mail-pg1-x530.google.com (mail-pg1-x530.google.com [IPv6:2607:f8b0:4864:20::530]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5F2CA10F5; Wed, 11 Oct 2023 20:51:58 -0700 (PDT) Received: by mail-pg1-x530.google.com with SMTP id 41be03b00d2f7-5859b2eaa55so390088a12.1; Wed, 11 Oct 2023 20:51:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082717; x=1697687517; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=P+PRT9OCPC1O//+Rr1ZGGVMfSA0ompzrBkf+48P2I2k=; b=TCqxMtL/FL/uNqsI2ZOOaetz0PoeeybsTxgvIZmUpLg/DxesxM3FTGm8wz9achY6DT NYixs+XKT+urXfY21jWtBja32TTJS9Pb4KVVKrGh2VdBkZKrGQWhEJ8A53F6bvvRjNFv 6XPlmlGFqrgTG8NzI+EXUWr57SfWe5EvLoBzd8fq5oA38Vq/Njv61VVHGM6IwsRi+y/A eqNkWBKT9j1bzoMMUFGLmCwAPiENbGMsZH/WNwSZ5h3rg0wwsBJEawLKGWJXO6CHTcud oegh0tHidUfclYz+e1A6kGGr37MYlL4T9Sm5Ah94jUNwJW1hywx3HPXGKefvLjkhhoNX S17Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082717; x=1697687517; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=P+PRT9OCPC1O//+Rr1ZGGVMfSA0ompzrBkf+48P2I2k=; b=HjR1ZzRd5/3LGmw86nnASmBmXsXUGCSFX+6t8jrmgTG7FAVnnlGBdEGxfGp5QwTyWB 
fjK0dAeABjqbkAm3gMw9eP93C8AzCDUuPjqUVuMbqmMBo1Z3TNmv8qPSI7XTsyWL+ezy xl2qCaU6LzHkVP2W9Bit0yF3vLGJ4f8EvnA+Bq6mRm4MvxxBXDO2mHKIPftlW5LrV4n0 NVyt71w9HJCXLFgqYPxe26wnbOlhy9PI4VWSSWBnsSddj6Acx7NYMDsNACIA3/eIdLIi EtDqcPFOTjBv9Sxas+2qUOzSkvIyWbZRUqCqa5VNdWm+oW4s/HngkAhDmovDselqzuzH /yWg== X-Gm-Message-State: AOJu0YwPG1OAsXklEPvp2wW/x1t//b8Te4W+VuDkn7BBMyaEi8i2Qin9 SOxrowAva1D16DINopQC2Ik= X-Google-Smtp-Source: AGHT+IHq7Fp4XmHZwcP6ftgkfHGCtOVU/Z2ILrrSv7VHIfkH1Nd6f0JhquAGcLDwMbB7beeH1ZwjSQ== X-Received: by 2002:a05:6a20:9381:b0:14d:9938:735f with SMTP id x1-20020a056a20938100b0014d9938735fmr23770088pzh.17.1697082716845; Wed, 11 Oct 2023 20:51:56 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.51.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:51:56 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 32/48] perf annotate-data: Support stack variables Date: Wed, 11 Oct 2023 20:50:55 -0700 Message-ID: <20231012035111.676789-33-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Local variables are allocated on the stack, and their location lists are expressed as base register(s) and an offset. Extend die_find_variable_by_reg() to handle the following expressions: * DW_OP_breg{0..31} * DW_OP_bregx * DW_OP_fbreg Usually DWARF subprogram entries have frame base information and use it to locate stack variables like below: <2><43d1575>: Abbrev Number: 62 (DW_TAG_variable) <43d1576> DW_AT_location : 2 byte block: 91 7c (DW_OP_fbreg: -4) <--- here <43d1579> DW_AT_name : (indirect string, offset: 0x2c00c9): i <43d157d> DW_AT_decl_file : 1 <43d157e> DW_AT_decl_line : 78 <43d157f> DW_AT_type : <0x43d19d7> I found some differences in how gcc and clang save the frame base. gcc uses the CFA as the base, so the current frame's CFI info needs to be checked. In this case, the stack offset needs to be adjusted relative to the CFA. <1><1bb8d>: Abbrev Number: 102 (DW_TAG_subprogram) <1bb8e> DW_AT_name : (indirect string, offset: 0x74d41): kernel_init <1bb92> DW_AT_decl_file : 2 <1bb92> DW_AT_decl_line : 1440 <1bb94> DW_AT_decl_column : 18 <1bb95> DW_AT_prototyped : 1 <1bb95> DW_AT_type : <0xcc> <1bb99> DW_AT_low_pc : 0xffffffff81bab9e0 <1bba1> DW_AT_high_pc : 0x1b2 <1bba9> DW_AT_frame_base : 1 byte block: 9c (DW_OP_call_frame_cfa) <------ here <1bbab> DW_AT_call_all_calls: 1 <1bbab> DW_AT_sibling : <0x1bf5a> clang, on the other hand, sets the frame base to a register directly, so the register and offset in the instruction can be checked directly.
<1><43d1542>: Abbrev Number: 60 (DW_TAG_subprogram) <43d1543> DW_AT_low_pc : 0xffffffff816a7c60 <43d154b> DW_AT_high_pc : 0x98 <43d154f> DW_AT_frame_base : 1 byte block: 56 (DW_OP_reg6 (rbp)) <---------- here <43d1551> DW_AT_GNU_all_call_sites: 1 <43d1551> DW_AT_name : (indirect string, offset: 0x3bce91): foo <43d1555> DW_AT_decl_file : 1 <43d1556> DW_AT_decl_line : 75 <43d1557> DW_AT_prototyped : 1 <43d1557> DW_AT_type : <0x43c7332> <43d155b> DW_AT_external : 1 Also it needs to update the offset after finding the type like global variables since the offset was from the frame base. Factor out match_var_offset() to check global and local variables in the same way. The type stats are improved too: Annotate data type stats: total 294, ok 160 (54.4%), bad 134 (45.6%) ----------------------------------------------------------- 30 : no_sym 32 : no_mem_ops 51 : no_var 14 : no_typeinfo 7 : bad_offset Signed-off-by: Namhyung Kim --- tools/perf/util/annotate-data.c | 35 +++++++++++++-- tools/perf/util/dwarf-aux.c | 79 ++++++++++++++++++++++++--------- tools/perf/util/dwarf-aux.h | 3 ++ 3 files changed, 93 insertions(+), 24 deletions(-) diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c index 857e2fbe83f2..39bbd56b2160 100644 --- a/tools/perf/util/annotate-data.c +++ b/tools/perf/util/annotate-data.c @@ -226,7 +226,7 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset, /* * Usually it expects a pointer type for a memory access. * Convert to a real type it points to. But global variables - * are accessed directly without a pointer. + * and local variables are accessed directly without a pointer. 
*/ if (is_pointer) { if ((dwarf_tag(type_die) != DW_TAG_pointer_type && @@ -265,6 +265,9 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr, int reg, offset; int ret = -1; int i, nr_scopes; + int fbreg = -1; + bool is_fbreg = false; + int fb_offset = 0; /* Get a compile_unit for this address */ if (!find_cu_die(di, pc, &cu_die)) { @@ -296,7 +299,33 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr, /* Get a list of nested scopes - i.e. (inlined) functions and blocks. */ nr_scopes = die_get_scopes(&cu_die, pc, &scopes); + if (reg != DWARF_REG_PC && dwarf_hasattr(&scopes[0], DW_AT_frame_base)) { + Dwarf_Attribute attr; + Dwarf_Block block; + + /* Check if the 'reg' is assigned as frame base register */ + if (dwarf_attr(&scopes[0], DW_AT_frame_base, &attr) != NULL && + dwarf_formblock(&attr, &block) == 0 && block.length == 1) { + switch (*block.data) { + case DW_OP_reg0 ... DW_OP_reg31: + fbreg = *block.data - DW_OP_reg0; + break; + case DW_OP_call_frame_cfa: + if (die_get_cfa(di->dbg, pc, &fbreg, + &fb_offset) < 0) + fbreg = -1; + break; + default: + break; + } + } + } + retry: + is_fbreg = (reg == fbreg); + if (is_fbreg) + offset = loc->offset - fb_offset; + /* Search from the inner-most scope to the outer */ for (i = nr_scopes - 1; i >= 0; i--) { if (reg == DWARF_REG_PC) { @@ -306,13 +335,13 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr, } else { /* Look up variables/parameters in this scope */ if (!die_find_variable_by_reg(&scopes[i], pc, reg, - &var_die)) + &offset, is_fbreg, &var_die)) continue; } /* Found a variable, see if it's correct */ ret = check_variable(&var_die, type_die, offset, - reg != DWARF_REG_PC); + reg != DWARF_REG_PC && !is_fbreg); loc->offset = offset; goto out; } diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c index 796413eb4e8f..7f3822d08ab7 100644 --- a/tools/perf/util/dwarf-aux.c +++ b/tools/perf/util/dwarf-aux.c @@ -1272,11 +1272,39 @@ struct 
find_var_data { unsigned reg; /* Access offset, set for global data */ int offset; + /* True if the current register is the frame base */ + bool is_fbreg; }; /* Max number of registers DW_OP_regN supports */ #define DWARF_OP_DIRECT_REGS 32 +static bool match_var_offset(Dwarf_Die *die_mem, struct find_var_data *data, + u64 addr_offset, u64 addr_type) +{ + Dwarf_Die type_die; + Dwarf_Word size; + + if (addr_offset == addr_type) { + /* Update offset relative to the start of the variable */ + data->offset = 0; + return true; + } + + if (die_get_real_type(die_mem, &type_die) == NULL) + return false; + + if (dwarf_aggregate_size(&type_die, &size) < 0) + return false; + + if (addr_offset >= addr_type + size) + return false; + + /* Update offset relative to the start of the variable */ + data->offset = addr_offset - addr_type; + return true; +} + /* Only checks direct child DIEs in the given scope. */ static int __die_find_var_reg_cb(Dwarf_Die *die_mem, void *arg) { @@ -1301,14 +1329,30 @@ static int __die_find_var_reg_cb(Dwarf_Die *die_mem, void *arg) if (start > data->pc) break; + /* Local variables accessed using frame base register */ + if (data->is_fbreg && ops->atom == DW_OP_fbreg && + data->offset >= (int)ops->number && + match_var_offset(die_mem, data, data->offset, ops->number)) + return DIE_FIND_CB_END; + /* Only match with a simple case */ if (data->reg < DWARF_OP_DIRECT_REGS) { if (ops->atom == (DW_OP_reg0 + data->reg) && nops == 1) return DIE_FIND_CB_END; + + /* Local variables accessed by a register + offset */ + if (ops->atom == (DW_OP_breg0 + data->reg) && + match_var_offset(die_mem, data, data->offset, ops->number)) + return DIE_FIND_CB_END; } else { if (ops->atom == DW_OP_regx && ops->number == data->reg && nops == 1) return DIE_FIND_CB_END; + + /* Local variables accessed by a register + offset */ + if (ops->atom == DW_OP_bregx && data->reg == ops->number && + match_var_offset(die_mem, data, data->offset, ops->number2)) + return DIE_FIND_CB_END; } } 
return DIE_FIND_CB_SIBLING; @@ -1319,18 +1363,29 @@ static int __die_find_var_reg_cb(Dwarf_Die *die_mem, void *arg) * @sc_die: a scope DIE * @pc: the program address to find * @reg: the register number to find + * @poffset: pointer to offset, will be updated for fbreg case + * @is_fbreg: boolean value if the current register is the frame base * @die_mem: a buffer to save the resulting DIE * - * Find the variable DIE accessed by the given register. + * Find the variable DIE accessed by the given register. It'll update the @offset + * when the variable is in the stack. */ Dwarf_Die *die_find_variable_by_reg(Dwarf_Die *sc_die, Dwarf_Addr pc, int reg, + int *poffset, bool is_fbreg, Dwarf_Die *die_mem) { struct find_var_data data = { .pc = pc, .reg = reg, + .offset = *poffset, + .is_fbreg = is_fbreg, }; - return die_find_child(sc_die, __die_find_var_reg_cb, &data, die_mem); + Dwarf_Die *result; + + result = die_find_child(sc_die, __die_find_var_reg_cb, &data, die_mem); + if (result) + *poffset = data.offset; + return result; } /* Only checks direct child DIEs in the given scope */ @@ -1341,8 +1396,6 @@ static int __die_find_var_addr_cb(Dwarf_Die *die_mem, void *arg) ptrdiff_t off = 0; Dwarf_Attribute attr; Dwarf_Addr base, start, end; - Dwarf_Word size; - Dwarf_Die type_die; Dwarf_Op *ops; size_t nops; @@ -1359,24 +1412,8 @@ static int __die_find_var_addr_cb(Dwarf_Die *die_mem, void *arg) if (data->addr < ops->number) continue; - if (data->addr == ops->number) { - /* Update offset relative to the start of the variable */ - data->offset = 0; + if (match_var_offset(die_mem, data, data->addr, ops->number)) return DIE_FIND_CB_END; - } - - if (die_get_real_type(die_mem, &type_die) == NULL) - continue; - - if (dwarf_aggregate_size(&type_die, &size) < 0) - continue; - - if (data->addr >= ops->number + size) - continue; - - /* Update offset relative to the start of the variable */ - data->offset = data->addr - ops->number; - return DIE_FIND_CB_END; } return DIE_FIND_CB_SIBLING; 
} diff --git a/tools/perf/util/dwarf-aux.h b/tools/perf/util/dwarf-aux.h index 29a7243b1a45..dc7e98678216 100644 --- a/tools/perf/util/dwarf-aux.h +++ b/tools/perf/util/dwarf-aux.h @@ -142,6 +142,7 @@ int die_get_var_range(Dwarf_Die *sp_die, Dwarf_Die *vr_die, struct strbuf *buf); /* Find a variable saved in the 'reg' at given address */ Dwarf_Die *die_find_variable_by_reg(Dwarf_Die *sc_die, Dwarf_Addr pc, int reg, + int *poffset, bool is_fbreg, Dwarf_Die *die_mem); /* Find a (global) variable located in the 'addr' */ @@ -164,6 +165,8 @@ static inline int die_get_var_range(Dwarf_Die *sp_die __maybe_unused, static inline Dwarf_Die *die_find_variable_by_reg(Dwarf_Die *sc_die __maybe_unused, Dwarf_Addr pc __maybe_unused, int reg __maybe_unused, + int *poffset __maybe_unused, + bool is_fbreg __maybe_unused, Dwarf_Die *die_mem __maybe_unused) { return NULL; From patchwork Thu Oct 12 03:50:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418240 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 97E051368; Thu, 12 Oct 2023 03:52:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="jeNudOb1" Received: from mail-pl1-x62b.google.com (mail-pl1-x62b.google.com [IPv6:2607:f8b0:4864:20::62b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0881B10FE; Wed, 11 Oct 2023 20:51:59 -0700 (PDT) Received: by mail-pl1-x62b.google.com with SMTP id d9443c01a7336-1c9daca2b85so3889955ad.1; Wed, 11 Oct 2023 20:51:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082718; x=1697687518; darn=vger.kernel.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=l0uukPGLNEM1tQxhVgCDG+pFph2Ypn/ZDoH5IEqzJ3I=; b=jeNudOb1UFmXpW89hxAkI3WSiEB6zlvZNbAm5Que5kBkvID8hDLLLKCOiQLA/hBXfS NCbWDJrizyD2dHCeQ9TaQlO78OWvFi7tvOvpd9kBdLPJ24N0d3wtq0nCl3NwvXG5oS+7 hSmqaTQ6r8mZw67GNIlKGMbSVcB+iJUIGXiAVayU9q6eDu22XDNxExiJNVwrlUpfD/Id P4XYlZS3fSwTbXx7CPBz7Q0rO13TRIHfVDBngzD4pm4/xJ7YF01yEkjIioiJdI/B1pzc IVh5kQPiJgN2ViKeI6zPO7oxdYABRC6xL2cuDxi+WwO/0bGalc3G8pnoQa5PmrHs/YZY ROeA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082718; x=1697687518; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=l0uukPGLNEM1tQxhVgCDG+pFph2Ypn/ZDoH5IEqzJ3I=; b=wA71cMn91nK+ya4aySVgXJIYw3kwqWv/EGQ5rko8RspoDU9qHsG+zBPHCHG50tVkuI r9ggxOFXyCsLE3/FUsHPPqjFqBSWE0mes6x0Ev76gVEVCCm0l2PvkVqOL0Bp2wNc7iDR rlM4kutFSB6PjI3nqZN3rGfQwSXN3zykiGEYJKKkw4OIPGrv7vVZAGuiuoIfcg5B9c4Q NOedrqi042YAqGQFicjFMhRRAGigD75xS1pu1cLzaSsYHHcO5weu0+giDz9uiPndbnDc nTEEUMGCIPQdBzRp/gunfgplHl6e1VBDVYjFFOXpiZTjoucd+KNG9kiuvy601CR2zIOl uPfA== X-Gm-Message-State: AOJu0YylSRkGZ+VNS1rrKKcgQDvlot2EOUqJWwLWyPwxG9vzP9ER2OMT 8TDnJy36r2FOlL2bDveRAoM= X-Google-Smtp-Source: AGHT+IEclYaWjsPPKcKoo/WLvGVTPPpcj9luv3OfJd0T5zaVimdTR2Sng9+6xF99eXfZeJP14Uworg== X-Received: by 2002:a17:902:e852:b0:1c5:59dc:6e93 with SMTP id t18-20020a170902e85200b001c559dc6e93mr34503545plg.3.1697082718091; Wed, 11 Oct 2023 20:51:58 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.51.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:51:57 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter 
Zijlstra
Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users@vger.kernel.org, Linus Torvalds, Stephane Eranian, Masami Hiramatsu, linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org
Subject: [PATCH 33/48] perf dwarf-aux: Check allowed DWARF Ops
Date: Wed, 11 Oct 2023 20:50:56 -0700
Message-ID: <20231012035111.676789-34-namhyung@kernel.org>
X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog
In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org>
References: <20231012035111.676789-1-namhyung@kernel.org>
Precedence: bulk
X-Mailing-List: linux-trace-devel@vger.kernel.org
MIME-Version: 1.0
X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net

The DWARF location expression can be fairly complex and it'd be hard to
match it with the condition correctly. So let's be conservative and only
allow simple expressions. Until now it only checked the first operation
in the list. The following operations look ok:

 * DW_OP_stack_value
 * DW_OP_deref_size
 * DW_OP_deref
 * DW_OP_piece

To refuse complex (and unsupported) location expressions, add
check_allowed_ops() to check the rest of the list. It seems the earlier
results contained those unsupported expressions. For example, I found
that some local struct variable is placed like below.

 <2><43d1517>: Abbrev Number: 62 (DW_TAG_variable)
    <43d1518>   DW_AT_location    : 15 byte block: 91 50 93 8 91 78 93 4 93 84 8 91 68 93 4
        (DW_OP_fbreg: -48; DW_OP_piece: 8; DW_OP_fbreg: -8; DW_OP_piece: 4;
         DW_OP_piece: 1028; DW_OP_fbreg: -24; DW_OP_piece: 4)

Another example is something like this.
0057c8be ffffffffffffffff ffffffff812109f0 (base address) 0057c8ce ffffffff812112b5 ffffffff812112c8 (DW_OP_breg3 (rbx): 0; DW_OP_constu: 18446744073709551612; DW_OP_and; DW_OP_stack_value) It should refuse them. After the change, the stat shows: Annotate data type stats: total 294, ok 158 (53.7%), bad 136 (46.3%) ----------------------------------------------------------- 30 : no_sym 32 : no_mem_ops 53 : no_var 14 : no_typeinfo 7 : bad_offset Signed-off-by: Namhyung Kim Acked-by: Masami Hiramatsu (Google) --- tools/perf/util/dwarf-aux.c | 44 +++++++++++++++++++++++++++++++++---- 1 file changed, 40 insertions(+), 4 deletions(-) diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c index 7f3822d08ab7..093d7e82b333 100644 --- a/tools/perf/util/dwarf-aux.c +++ b/tools/perf/util/dwarf-aux.c @@ -1305,6 +1305,34 @@ static bool match_var_offset(Dwarf_Die *die_mem, struct find_var_data *data, return true; } +static bool check_allowed_ops(Dwarf_Op *ops, size_t nops) +{ + /* The first op is checked separately */ + ops++; + nops--; + + /* + * It needs to make sure if the location expression matches to the given + * register and offset exactly. Thus it rejects any complex expressions + * and only allows a few of selected operators that doesn't change the + * location. + */ + while (nops) { + switch (ops->atom) { + case DW_OP_stack_value: + case DW_OP_deref_size: + case DW_OP_deref: + case DW_OP_piece: + break; + default: + return false; + } + ops++; + nops--; + } + return true; +} + /* Only checks direct child DIEs in the given scope. 
*/ static int __die_find_var_reg_cb(Dwarf_Die *die_mem, void *arg) { @@ -1332,25 +1360,31 @@ static int __die_find_var_reg_cb(Dwarf_Die *die_mem, void *arg) /* Local variables accessed using frame base register */ if (data->is_fbreg && ops->atom == DW_OP_fbreg && data->offset >= (int)ops->number && + check_allowed_ops(ops, nops) && match_var_offset(die_mem, data, data->offset, ops->number)) return DIE_FIND_CB_END; /* Only match with a simple case */ if (data->reg < DWARF_OP_DIRECT_REGS) { - if (ops->atom == (DW_OP_reg0 + data->reg) && nops == 1) + /* pointer variables saved in a register 0 to 31 */ + if (ops->atom == (DW_OP_reg0 + data->reg) && + check_allowed_ops(ops, nops)) return DIE_FIND_CB_END; /* Local variables accessed by a register + offset */ if (ops->atom == (DW_OP_breg0 + data->reg) && + check_allowed_ops(ops, nops) && match_var_offset(die_mem, data, data->offset, ops->number)) return DIE_FIND_CB_END; } else { + /* pointer variables saved in a register 32 or above */ if (ops->atom == DW_OP_regx && ops->number == data->reg && - nops == 1) + check_allowed_ops(ops, nops)) return DIE_FIND_CB_END; /* Local variables accessed by a register + offset */ if (ops->atom == DW_OP_bregx && data->reg == ops->number && + check_allowed_ops(ops, nops) && match_var_offset(die_mem, data, data->offset, ops->number2)) return DIE_FIND_CB_END; } @@ -1412,7 +1446,8 @@ static int __die_find_var_addr_cb(Dwarf_Die *die_mem, void *arg) if (data->addr < ops->number) continue; - if (match_var_offset(die_mem, data, data->addr, ops->number)) + if (check_allowed_ops(ops, nops) && + match_var_offset(die_mem, data, data->addr, ops->number)) return DIE_FIND_CB_END; } return DIE_FIND_CB_SIBLING; @@ -1501,7 +1536,8 @@ int die_get_cfa(Dwarf *dwarf, u64 pc, int *preg, int *poffset) return -1; if (!dwarf_cfi_addrframe(cfi, pc, &frame) && - !dwarf_frame_cfa(frame, &ops, &nops) && nops == 1) { + !dwarf_frame_cfa(frame, &ops, &nops) && + check_allowed_ops(ops, nops)) { *preg = 
reg_from_dwarf_op(ops); *poffset = offset_from_dwarf_op(ops); return 0; From patchwork Thu Oct 12 03:50:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418242 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 162271373; Thu, 12 Oct 2023 03:52:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="iRapRysN" Received: from mail-pg1-x535.google.com (mail-pg1-x535.google.com [IPv6:2607:f8b0:4864:20::535]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6ADBFE8; Wed, 11 Oct 2023 20:52:00 -0700 (PDT) Received: by mail-pg1-x535.google.com with SMTP id 41be03b00d2f7-584a761b301so415664a12.3; Wed, 11 Oct 2023 20:52:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082719; x=1697687519; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=qpkTPPnVT3dzSfLyQxD7kyX6iWqu+XT6fM/HnJbBp0M=; b=iRapRysNegmLAW1kiMHDQ4vuiqCWLikuCH0jTI1/dsgffAgGr3rXkHUV4VlrFdpag+ JDdTYuM6LINFSSyGxqUNJ5LBfM4kpPOMiIbclYN+T7Jz9NO3pPZNCDOYypR7ZVy/YJ/k +DOtJvofpNpUY/lxVowM7blG9+ED2OwkUs18DWbPtS9pO0QTvfCDmT3j2y32nQ3jpN7H 76522aVSORHXN4IKb3vyuxgTGXo9rA1hYmX1jX9t8q7Hb2X8/uhaawOZlcVj+2OzfxMV +x5l7APhZEJHQ3HV6f8MUrgBiJfDEAcRYoL1q+mbcTU+5AN+1c9o6ojnbBhMhs7rcVQz 8TXA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082719; x=1697687519; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; 
bh=qpkTPPnVT3dzSfLyQxD7kyX6iWqu+XT6fM/HnJbBp0M=; b=ly6KmVDfC9RYV6JtT78xmxnpwuvpQsDXpBU9T86grKWhOvB6UWD4Ku6fDrqIhsbbuE qX++g7BmyJ7UXokY7JwYUGR5RbFOX3zoaHu/kjGiY9+jLdkVRwf/pI1izFwLVvP9rHoo CHhsjtWD9+h1jpidjl5AZ/1AsNr0yWfShHE18HHwfKtfdohMrBC05Gmk7IccpqkbwCWS HdIHxxddIJ507EqNnvIj8zF8BC+iY+OE73ovwAlE8caUalQgAgZqbFyNiu/s8zn+5SOv QDLAg2Qh8WQUbfK0an4+r56/4RyaalZZQDr00dASJQ54hqmVa9z6cjhhsANzJhu98Idv KYMQ== X-Gm-Message-State: AOJu0YzirwGlzgFTO5dgdsRlFscz6OV+jM+8ryHU4tMgarqoBxdiDNTq z6R2KJV5qHhtQpp7sf6gQFk= X-Google-Smtp-Source: AGHT+IHRPDCm4a1A17Vsd5OLn71LjwNXJRyjzHslksOAB3XteLeIMZ230y1oM0Ecxkq4Xj7r6Arg2A== X-Received: by 2002:a05:6a20:6a0c:b0:13a:59b1:c884 with SMTP id p12-20020a056a206a0c00b0013a59b1c884mr25517428pzk.40.1697082719385; Wed, 11 Oct 2023 20:51:59 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.51.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:51:59 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 34/48] perf dwarf-aux: Add die_collect_vars() Date: Wed, 11 Oct 2023 20:50:57 -0700 Message-ID: <20231012035111.676789-35-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, 
HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net The die_collect_vars() is to find all variable information in the scope including function parameters. The struct die_var_type is to save the type of the variable with the location (reg and offset) as well as where it's defined in the code (addr). Signed-off-by: Namhyung Kim Acked-by: Masami Hiramatsu (Google) --- tools/perf/util/dwarf-aux.c | 60 +++++++++++++++++++++++++++++++++++++ tools/perf/util/dwarf-aux.h | 17 +++++++++++ 2 files changed, 77 insertions(+) diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c index 093d7e82b333..16e63d8caf83 100644 --- a/tools/perf/util/dwarf-aux.c +++ b/tools/perf/util/dwarf-aux.c @@ -1545,6 +1545,66 @@ int die_get_cfa(Dwarf *dwarf, u64 pc, int *preg, int *poffset) return -1; } +static int __die_collect_vars_cb(Dwarf_Die *die_mem, void *arg) +{ + struct die_var_type **var_types = arg; + Dwarf_Die type_die; + int tag = dwarf_tag(die_mem); + Dwarf_Attribute attr; + Dwarf_Addr base, start, end; + Dwarf_Op *ops; + size_t nops; + struct die_var_type *vt; + + if (tag != DW_TAG_variable && tag != DW_TAG_formal_parameter) + return DIE_FIND_CB_SIBLING; + + if (dwarf_attr(die_mem, DW_AT_location, &attr) == NULL) + return DIE_FIND_CB_SIBLING; + + /* + * Only collect the first location as it can reconstruct the + * remaining state by following the instructions. + * start = 0 means it covers the whole range. 
+ */ + if (dwarf_getlocations(&attr, 0, &base, &start, &end, &ops, &nops) <= 0) + return DIE_FIND_CB_SIBLING; + + if (die_get_real_type(die_mem, &type_die) == NULL) + return DIE_FIND_CB_SIBLING; + + vt = malloc(sizeof(*vt)); + if (vt == NULL) + return DIE_FIND_CB_END; + + vt->die_off = dwarf_dieoffset(&type_die); + vt->addr = start; + vt->reg = reg_from_dwarf_op(ops); + vt->offset = offset_from_dwarf_op(ops); + vt->next = *var_types; + *var_types = vt; + + return DIE_FIND_CB_SIBLING; +} + +/** + * die_collect_vars - Save all variables and parameters + * @sc_die: a scope DIE + * @var_types: a pointer to save the resulting list + * + * Save all variables and parameters in the @sc_die and save them to @var_types. + * The @var_types is a singly-linked list containing type and location info. + * Actual type can be retrieved using dwarf_offdie() with 'die_off' later. + * + * Callers should free @var_types. + */ +void die_collect_vars(Dwarf_Die *sc_die, struct die_var_type **var_types) +{ + Dwarf_Die die_mem; + + die_find_child(sc_die, __die_collect_vars_cb, (void *)var_types, &die_mem); +} + #endif /* diff --git a/tools/perf/util/dwarf-aux.h b/tools/perf/util/dwarf-aux.h index dc7e98678216..d0ef41738abd 100644 --- a/tools/perf/util/dwarf-aux.h +++ b/tools/perf/util/dwarf-aux.h @@ -135,6 +135,15 @@ void die_skip_prologue(Dwarf_Die *sp_die, Dwarf_Die *cu_die, /* Get the list of including scopes */ int die_get_scopes(Dwarf_Die *cu_die, Dwarf_Addr pc, Dwarf_Die **scopes); +/* Variable type information */ +struct die_var_type { + struct die_var_type *next; + u64 die_off; + u64 addr; + int reg; + int offset; +}; + #ifdef HAVE_DWARF_GETLOCATIONS_SUPPORT /* Get byte offset range of given variable DIE */ @@ -153,6 +162,9 @@ Dwarf_Die *die_find_variable_by_addr(Dwarf_Die *sc_die, Dwarf_Addr pc, /* Get the frame base information from CFA */ int die_get_cfa(Dwarf *dwarf, u64 pc, int *preg, int *poffset); +/* Save all variables and parameters in this scope */ +void 
die_collect_vars(Dwarf_Die *sc_die, struct die_var_type **var_types); + #else /* HAVE_DWARF_GETLOCATIONS_SUPPORT */ static inline int die_get_var_range(Dwarf_Die *sp_die __maybe_unused, @@ -187,6 +199,11 @@ static inline int die_get_cfa(Dwarf *dwarf __maybe_unused, u64 pc __maybe_unused return -1; } +static inline void die_collect_vars(Dwarf_Die *sc_die __maybe_unused, + struct die_var_type **var_types __maybe_unused) +{ +} + #endif /* HAVE_DWARF_GETLOCATIONS_SUPPORT */ #endif /* _DWARF_AUX_H */ From patchwork Thu Oct 12 03:50:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418247 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0129817E4; Thu, 12 Oct 2023 03:52:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="lJiWKU8A" Received: from mail-pg1-x535.google.com (mail-pg1-x535.google.com [IPv6:2607:f8b0:4864:20::535]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EC69B1701; Wed, 11 Oct 2023 20:52:02 -0700 (PDT) Received: by mail-pg1-x535.google.com with SMTP id 41be03b00d2f7-577fff1cae6so371555a12.1; Wed, 11 Oct 2023 20:52:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082721; x=1697687521; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=fScxRg9rCJRGXzRKFXZYIJ1+B/7O2DI4+C7sBTfd7FY=; b=lJiWKU8AtS4NCexm+dP94oZiBZ/Q4xH5lY3ICVkrpqZ+JXlq4RLbbkuqAo8u7OipsI hwY4j48BIH/NxjWLnizb8KPPQSp7pCXJK5u7v9L6o07spCCy3OHUX2EHpGOEc0snl91G l2WL3LYhsZIGQ4mb7rX7f4fxYqTlAagRtnwjIwpvsB2XVyU3FJbNfbiAB1QOcIl0wBaw 
ZeUXTlNow7LSxHhhUM0XgAQX26B1Y8/5fDrlhqawLqiWhWT4CtMP2PJXQSgYeQAeuyHO AMmGh4kV7lWZoC/z24z5wJ9k4+gKGD6Kky8VCgoHvYDFwVPvNNiFTh1/xk+iFcRLLPlT em6Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082721; x=1697687521; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=fScxRg9rCJRGXzRKFXZYIJ1+B/7O2DI4+C7sBTfd7FY=; b=Iosg88YQSs5GB5bJ+j7P/vVY44ajLhlPVuc2CFr4sKeI9iRlQ5YdFgNsiNQzbQkUUn eVZOnxFvUQSWBmyjnyXw2lqm7MCcoboD/nSUoWOJS+QWOqtgzAFDSZax3BKQc4MKqPB+ TnANHai0A9axMmbf8enwPUqL8r+kYt8nVtkoWP+Lqnkl+Hm9EVyEi4i/r2HPU8Aq06ql 2G+ryIV+HlrnFegXTSQjCg18UnVjoUtmnHFBuLnbDAliHT0aLgJfZDazgSCGOmMUQGn9 VWxVSNRkxFSVU2kM57OT7Dg9EwevCLAIakkDTfXGvzP7Dp00UVu173WPDSpEX0ztFKbH QgSQ== X-Gm-Message-State: AOJu0Yxs/HeMcAqEkqAeqwPaVH9YEfl9Dns/OfcFwx6e3Uf7OmM7wXdz Aoc6AY2BRGEBsV9JQRVQ4Eo= X-Google-Smtp-Source: AGHT+IHlhdGSnXOOgE8Fj1pHBsqk3Wj5iiDiVTxZw1yj2XVcJ2yWDY1lVsJdvA1R+hqunadrwDVrcQ== X-Received: by 2002:a17:903:2352:b0:1b8:8682:62fb with SMTP id c18-20020a170903235200b001b8868262fbmr30826465plh.4.1697082720631; Wed, 11 Oct 2023 20:52:00 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.51.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:52:00 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 35/48] perf dwarf-aux: Handle type transfer for memory access Date: Wed, 11 Oct 2023 20:50:58 -0700 Message-ID: <20231012035111.676789-36-namhyung@kernel.org> X-Mailer: 
git-send-email 2.42.0.655.g421f12c284-goog
In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org>
References: <20231012035111.676789-1-namhyung@kernel.org>
Precedence: bulk
X-Mailing-List: linux-trace-devel@vger.kernel.org
MIME-Version: 1.0
X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net

We want to track type states as instructions are executed. Each
instruction can access compound types like struct or union and load/store
their members to a different location.

The die_deref_ptr_type() finds the type of a memory access made through a
pointer variable. If the pointer points to a compound type like struct,
the target memory is a member of the struct. The access will happen with
an offset indicating which member it refers to. Let's follow the DWARF
info to figure out the type of the pointer target.

For example, say we have the following code.

  struct foo {
    int a;
    int b;
  };

  struct foo *p = malloc(sizeof(*p));
  p->b = 0;

The last pointer access should produce x86 asm like below:

  movl $0x0, 4(%rbx)

And we know the %rbx register holds a pointer to struct foo. Then offset 4
should return the debug info of member 'b'.

Also, variables of compound types can be accessed directly without a
pointer. The die_get_member_type() handles such a case.
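
As a rough illustration of the offset lookup described above (a sketch
only, not the perf code): the toy_* types and functions below are
hypothetical stand-ins for DWARF type DIEs and for the member search, and
the struct inner member is added here just to show the nested case. It
assumes a member covers the byte range [loc, loc + size) and that the
offset is rebased each time the search descends into a nested struct.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for DWARF type DIEs; not part of perf or elfutils. */
struct toy_type;

struct toy_member {
	const char *name;
	size_t loc;			/* plays the role of the member location */
	const struct toy_type *type;	/* type of this member */
};

struct toy_type {
	const char *name;
	size_t size;			/* plays the role of the aggregate size */
	size_t nr_members;		/* 0 for a non-compound type */
	const struct toy_member *members;
};

/* Return the member whose byte range covers @offset, or NULL. */
static const struct toy_member *
toy_find_member(const struct toy_type *type, size_t offset)
{
	for (size_t i = 0; i < type->nr_members; i++) {
		const struct toy_member *m = &type->members[i];

		if (offset >= m->loc && offset < m->loc + m->type->size)
			return m;
	}
	return NULL;
}

/* Resolve @offset inside @type down to the innermost non-compound type. */
static const struct toy_type *
toy_get_member_type(const struct toy_type *type, size_t offset)
{
	/* Not a compound type: the offset only has to fit in the type. */
	if (type->nr_members == 0)
		return offset < type->size ? type : NULL;

	while (type->nr_members != 0) {
		const struct toy_member *m = toy_find_member(type, offset);

		if (m == NULL)
			return NULL;
		/* Rebase the offset to the start of the member found. */
		offset -= m->loc;
		type = m->type;
	}
	return type;
}

/* The example from the text, plus a nested struct for illustration. */
struct inner { short x; short y; };
struct foo { int a; int b; struct inner c; };

static const struct toy_type t_int = { "int", sizeof(int), 0, NULL };
static const struct toy_type t_short = { "short", sizeof(short), 0, NULL };
static const struct toy_member inner_members[] = {
	{ "x", offsetof(struct inner, x), &t_short },
	{ "y", offsetof(struct inner, y), &t_short },
};
static const struct toy_type t_inner = {
	"struct inner", sizeof(struct inner), 2, inner_members
};
static const struct toy_member foo_members[] = {
	{ "a", offsetof(struct foo, a), &t_int },
	{ "b", offsetof(struct foo, b), &t_int },
	{ "c", offsetof(struct foo, c), &t_inner },
};
static const struct toy_type t_foo = {
	"struct foo", sizeof(struct foo), 3, foo_members
};

/* Usage: the offset of 'b' resolves to member 'b' with type int. */
static int toy_demo(void)
{
	size_t off = offsetof(struct foo, b);
	const struct toy_member *m = toy_find_member(&t_foo, off);

	return m != NULL && strcmp(m->name, "b") == 0 &&
	       toy_get_member_type(&t_foo, off) == &t_int;
}
```

The sketch uses offsetof() rather than a literal 4 so that it does not
depend on a particular ABI. The real helpers additionally have to deal
with unions, typedefs and qualifiers, which are left out here.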
Signed-off-by: Namhyung Kim Acked-by: Masami Hiramatsu (Google) --- tools/perf/util/dwarf-aux.c | 110 ++++++++++++++++++++++++++++++++++++ tools/perf/util/dwarf-aux.h | 6 ++ 2 files changed, 116 insertions(+) diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c index 16e63d8caf83..5ec895e0a069 100644 --- a/tools/perf/util/dwarf-aux.c +++ b/tools/perf/util/dwarf-aux.c @@ -1838,3 +1838,113 @@ int die_get_scopes(Dwarf_Die *cu_die, Dwarf_Addr pc, Dwarf_Die **scopes) *scopes = data.scopes; return data.nr; } + +static int __die_find_member_offset_cb(Dwarf_Die *die_mem, void *arg) +{ + Dwarf_Die type_die; + Dwarf_Word size, loc; + Dwarf_Word offset = (long)arg; + int tag = dwarf_tag(die_mem); + + if (tag != DW_TAG_member) + return DIE_FIND_CB_SIBLING; + + /* Unions might not have location */ + if (die_get_data_member_location(die_mem, &loc) < 0) + loc = 0; + + if (offset == loc) + return DIE_FIND_CB_END; + + die_get_real_type(die_mem, &type_die); + + if (dwarf_aggregate_size(&type_die, &size) < 0) + size = 0; + + if (loc < offset && offset < (loc + size)) + return DIE_FIND_CB_END; + + return DIE_FIND_CB_SIBLING; +} + +/** + * die_get_member_type - Return type info of struct member + * @type_die: a type DIE + * @offset: offset in the type + * @die_mem: a buffer to save the resulting DIE + * + * This function returns a type of a member in @type_die where it's located at + * @offset if it's a struct. For now, it just returns the first matching + * member in a union. For other types, it'd return the given type directly + * if it's within the size of the type or NULL otherwise. 
+ */ +Dwarf_Die *die_get_member_type(Dwarf_Die *type_die, int offset, + Dwarf_Die *die_mem) +{ + Dwarf_Die *member; + Dwarf_Die mb_type; + int tag; + + tag = dwarf_tag(type_die); + /* If it's not a compound type, return the type directly */ + if (tag != DW_TAG_structure_type && tag != DW_TAG_union_type) { + Dwarf_Word size; + + if (dwarf_aggregate_size(type_die, &size) < 0) + size = 0; + + if ((unsigned)offset >= size) + return NULL; + + *die_mem = *type_die; + return die_mem; + } + + mb_type = *type_die; + /* TODO: Handle union types better? */ + while (tag == DW_TAG_structure_type || tag == DW_TAG_union_type) { + member = die_find_child(&mb_type, __die_find_member_offset_cb, + (void *)(long)offset, die_mem); + if (member == NULL) + return NULL; + + if (die_get_real_type(member, &mb_type) == NULL) + return NULL; + + tag = dwarf_tag(&mb_type); + + if (tag == DW_TAG_structure_type || tag == DW_TAG_union_type) { + Dwarf_Word loc; + + /* Update offset for the start of the member struct */ + if (die_get_data_member_location(member, &loc) == 0) + offset -= loc; + } + } + *die_mem = mb_type; + return die_mem; +} + +/** + * die_deref_ptr_type - Return type info for pointer access + * @ptr_die: a pointer type DIE + * @offset: access offset for the pointer + * @die_mem: a buffer to save the resulting DIE + * + * This function follows the pointer in @ptr_die with given @offset + * and saves the resulting type in @die_mem. If the pointer points + * a struct type, actual member at the offset would be returned. 
+ */ +Dwarf_Die *die_deref_ptr_type(Dwarf_Die *ptr_die, int offset, + Dwarf_Die *die_mem) +{ + Dwarf_Die type_die; + + if (dwarf_tag(ptr_die) != DW_TAG_pointer_type) + return NULL; + + if (die_get_real_type(ptr_die, &type_die) == NULL) + return NULL; + + return die_get_member_type(&type_die, offset, die_mem); +} diff --git a/tools/perf/util/dwarf-aux.h b/tools/perf/util/dwarf-aux.h index d0ef41738abd..df846bd30134 100644 --- a/tools/perf/util/dwarf-aux.h +++ b/tools/perf/util/dwarf-aux.h @@ -144,6 +144,12 @@ struct die_var_type { int offset; }; +/* Return type info of a member at offset */ +Dwarf_Die *die_get_member_type(Dwarf_Die *type_die, int offset, Dwarf_Die *die_mem); + +/* Return type info where the pointer and offset point to */ +Dwarf_Die *die_deref_ptr_type(Dwarf_Die *ptr_die, int offset, Dwarf_Die *die_mem); + #ifdef HAVE_DWARF_GETLOCATIONS_SUPPORT /* Get byte offset range of given variable DIE */ From patchwork Thu Oct 12 03:50:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418245 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A19C21C2D; Thu, 12 Oct 2023 03:52:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="SDNCGD6Z" Received: from mail-pl1-x633.google.com (mail-pl1-x633.google.com [IPv6:2607:f8b0:4864:20::633]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2AF54170A; Wed, 11 Oct 2023 20:52:02 -0700 (PDT) Received: by mail-pl1-x633.google.com with SMTP id d9443c01a7336-1c9b1e3a809so4210645ad.2; Wed, 11 Oct 2023 20:52:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082722; x=1697687522; darn=vger.kernel.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=QbNhx62mg3juSg8RKm4fZhibXLOkoCp5ZhYwVrAiyrk=; b=SDNCGD6Z8YSZP5/jdUJ5yPabXtGCym2OmFjSqx/xBRMhl1VBaJAuoIUSTqkUMP2+1w gZga+0/WK+cPL6LN1xsBYQCay0+ZiZ0gg2A+gg8k1o0iKUx/5I4USBCb55sW1ypIQboj mP8SvKbJaGBbfhhItRderUyR5A/q3NoeKweUKFVpOrl2Tsam+C+mi88le2hTWgvM2snE 7XIAGOsgsi7OVxK7gJZNK5d0OsWYP22r0gAbs8KOfGOwYrX2XE+0FeQBM8BRiMHYsQFl 9e8s3Gbt+R51vYk508d/df138LGC9DfKUoDCw6mIuAuOXQuuETTy6n5ossagmaMLa3Uy SpHA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082722; x=1697687522; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=QbNhx62mg3juSg8RKm4fZhibXLOkoCp5ZhYwVrAiyrk=; b=UuSyDLHnoFz45KfE38uVLz6SJBOfBl9orIXhRn/sCR0iAFWr7AoQbQzGo3vQbC7LRw q7OP1fTwe7Vxh4wxpfmC4Nl3mD5Rb2WJbOl28oVcv6eSpH/H+eyE3mjXFqE57yOCxUHV 5zt3a5dwrZ7bhn7sRDq7sCbUr6Z/k4LgWDZA/x5gzFXqV89+As4dZZkdzXA+ZdNaZmnH Yc813nJscGjrGhIJDp6KQQ+IeGC8lwYtCf/UPuFn1SyXa+wUlhG8sE/XtkFUfwCRZ5KA vQz44/1QxPjgZUSy2hQ/vp6+R5lBoFsavWS7vObr8zfKC1cmHJAOKGuQH84obJzRvw/v 67Vw== X-Gm-Message-State: AOJu0YwgesxYPmcbz9yueC806G9eqa+AXtWYiMffzO30Z0YUnguenGnW f1FGKG9FWg1oqdRMcVBhK4s= X-Google-Smtp-Source: AGHT+IHyazLbAAQ10RSjnGoc93S6UYpNyB3YIHV2vctgxFaPr2zkg/MZhi4HpqYeHLFvIyF/x+HQlw== X-Received: by 2002:a17:903:2286:b0:1c5:d8a3:8789 with SMTP id b6-20020a170903228600b001c5d8a38789mr25769463plh.4.1697082721970; Wed, 11 Oct 2023 20:52:01 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.52.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:52:01 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter 
Zijlstra
Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users@vger.kernel.org, Linus Torvalds, Stephane Eranian, Masami Hiramatsu, linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org
Subject: [PATCH 36/48] perf annotate-data: Introduce struct data_loc_info
Date: Wed, 11 Oct 2023 20:50:59 -0700
Message-ID: <20231012035111.676789-37-namhyung@kernel.org>
X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog
In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org>
References: <20231012035111.676789-1-namhyung@kernel.org>
Precedence: bulk
X-Mailing-List: linux-trace-devel@vger.kernel.org
MIME-Version: 1.0
X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net

The find_data_type() needs a lot of information to describe the location
of the data. Add the new struct data_loc_info to pass that information at
once.

No functional changes intended.
Signed-off-by: Namhyung Kim --- tools/perf/util/annotate-data.c | 83 +++++++++++++++++---------------- tools/perf/util/annotate-data.h | 38 ++++++++++++--- tools/perf/util/annotate.c | 30 ++++++------ 3 files changed, 91 insertions(+), 60 deletions(-) diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c index 39bbd56b2160..90793cbb6aa0 100644 --- a/tools/perf/util/annotate-data.c +++ b/tools/perf/util/annotate-data.c @@ -256,21 +256,28 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset, } /* The result will be saved in @type_die */ -static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr, - const char *var_name, struct annotated_op_loc *loc, - Dwarf_Die *type_die) +static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die) { + struct annotated_op_loc *loc = dloc->op; Dwarf_Die cu_die, var_die; Dwarf_Die *scopes = NULL; int reg, offset; int ret = -1; int i, nr_scopes; int fbreg = -1; - bool is_fbreg = false; int fb_offset = 0; + bool is_fbreg = false; + u64 pc; + + /* + * IP is a relative instruction address from the start of the map, as + * it can be randomized/relocated, it needs to translate to PC which is + * a file address for DWARF processing. 
+ */ + pc = map__rip_2objdump(dloc->ms->map, dloc->ip); /* Get a compile_unit for this address */ - if (!find_cu_die(di, pc, &cu_die)) { + if (!find_cu_die(dloc->di, pc, &cu_die)) { pr_debug("cannot find CU for address %lx\n", pc); ann_data_stat.no_cuinfo++; return -1; @@ -280,18 +287,19 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr, offset = loc->offset; if (reg == DWARF_REG_PC) { - if (die_find_variable_by_addr(&cu_die, pc, addr, &var_die, &offset)) { + if (die_find_variable_by_addr(&cu_die, pc, dloc->var_addr, + &var_die, &offset)) { ret = check_variable(&var_die, type_die, offset, /*is_pointer=*/false); - loc->offset = offset; + dloc->type_offset = offset; goto out; } - if (var_name && die_find_variable_at(&cu_die, var_name, pc, - &var_die)) { - ret = check_variable(&var_die, type_die, 0, + if (dloc->var_name && + die_find_variable_at(&cu_die, dloc->var_name, pc, &var_die)) { + ret = check_variable(&var_die, type_die, dloc->type_offset, /*is_pointer=*/false); - /* loc->offset will be updated by the caller */ + /* dloc->type_offset was updated by the caller */ goto out; } } @@ -308,10 +316,11 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr, dwarf_formblock(&attr, &block) == 0 && block.length == 1) { switch (*block.data) { case DW_OP_reg0 ... 
DW_OP_reg31: - fbreg = *block.data - DW_OP_reg0; + fbreg = dloc->fbreg = *block.data - DW_OP_reg0; break; case DW_OP_call_frame_cfa: - if (die_get_cfa(di->dbg, pc, &fbreg, + dloc->fb_cfa = true; + if (die_get_cfa(dloc->di->dbg, pc, &fbreg, &fb_offset) < 0) fbreg = -1; break; @@ -329,7 +338,7 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr, /* Search from the inner-most scope to the outer */ for (i = nr_scopes - 1; i >= 0; i--) { if (reg == DWARF_REG_PC) { - if (!die_find_variable_by_addr(&scopes[i], pc, addr, + if (!die_find_variable_by_addr(&scopes[i], pc, dloc->var_addr, &var_die, &offset)) continue; } else { @@ -342,7 +351,7 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr, /* Found a variable, see if it's correct */ ret = check_variable(&var_die, type_die, offset, reg != DWARF_REG_PC && !is_fbreg); - loc->offset = offset; + dloc->type_offset = offset; goto out; } @@ -361,50 +370,46 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr, /** * find_data_type - Return a data type at the location - * @ms: map and symbol at the location - * @ip: instruction address of the memory access - * @loc: instruction operand location - * @addr: data address of the memory access - * @var_name: global variable name + * @dloc: data location * * This functions searches the debug information of the binary to get the data - * type it accesses. The exact location is expressed by (@ip, reg, offset) - * for pointer variables or (@ip, @addr) for global variables. Note that global - * variables might update the @loc->offset after finding the start of the variable. - * If it cannot find a global variable by address, it tried to fine a declaration - * of the variable using @var_name. In that case, @loc->offset won't be updated. + * type it accesses. The exact location is expressed by (ip, reg, offset) + * for pointer variables or (ip, addr) for global variables. 
Note that global + * variables might update the @dloc->type_offset after finding the start of the + * variable. If it cannot find a global variable by address, it tries to find + * a declaration of the variable using var_name. In that case, @dloc->type_offset + * won't be updated. * * It return %NULL if not found. */ -struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip, - struct annotated_op_loc *loc, u64 addr, - const char *var_name) +struct annotated_data_type *find_data_type(struct data_loc_info *dloc) { struct annotated_data_type *result = NULL; - struct dso *dso = ms->map->dso; - struct debuginfo *di; + struct dso *dso = dloc->ms->map->dso; Dwarf_Die type_die; - u64 pc; - di = debuginfo__new(dso->long_name); - if (di == NULL) { + dloc->di = debuginfo__new(dso->long_name); + if (dloc->di == NULL) { pr_debug("cannot get the debug info\n"); return NULL; } /* - * IP is a relative instruction address from the start of the map, as - * it can be randomized/relocated, it needs to translate to PC which is - * a file address for DWARF processing. + * The type offset is the same as instruction offset by default. + * But when finding a global variable, the offset won't be valid.
*/ - pc = map__rip_2objdump(ms->map, ip); - if (find_data_type_die(di, pc, addr, var_name, loc, &type_die) < 0) + if (dloc->var_name == NULL) + dloc->type_offset = dloc->op->offset; + + dloc->fbreg = -1; + + if (find_data_type_die(dloc, &type_die) < 0) goto out; result = dso__findnew_data_type(dso, &type_die); out: - debuginfo__delete(di); + debuginfo__delete(dloc->di); return result; } diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h index 1b0db8e8c40e..ad6493ea2c8e 100644 --- a/tools/perf/util/annotate-data.h +++ b/tools/perf/util/annotate-data.h @@ -8,6 +8,7 @@ #include struct annotated_op_loc; +struct debuginfo; struct evsel; struct map_symbol; @@ -72,6 +73,35 @@ struct annotated_data_type { extern struct annotated_data_type unknown_type; extern struct annotated_data_type stackop_type; +/** + * struct data_loc_info - Data location information + * @ms: Map and Symbol info + * @ip: Instruction address + * @var_addr: Data address (for global variables) + * @var_name: Variable name (for global variables) + * @op: Instruction operand location (regs and offset) + * @di: Debug info + * @fbreg: Frame base register + * @fb_cfa: Whether the frame needs to check CFA + * @type_offset: Final offset in the type + */ +struct data_loc_info { + /* These are input field, should be filled by caller */ + struct map_symbol *ms; + u64 ip; + u64 var_addr; + const char *var_name; + struct annotated_op_loc *op; + + /* These are used internally */ + struct debuginfo *di; + int fbreg; + bool fb_cfa; + + /* This is for the result */ + int type_offset; +}; + /** * struct annotated_data_stat - Debug statistics * @total: Total number of entry @@ -106,9 +136,7 @@ extern struct annotated_data_stat ann_data_stat; #ifdef HAVE_DWARF_SUPPORT /* Returns data type at the location (ip, reg, offset) */ -struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip, - struct annotated_op_loc *loc, u64 addr, - const char *var_name); +struct annotated_data_type 
*find_data_type(struct data_loc_info *dloc); /* Update type access histogram at the given offset */ int annotated_data_type__update_samples(struct annotated_data_type *adt, @@ -121,9 +149,7 @@ void annotated_data_type__tree_delete(struct rb_root *root); #else /* HAVE_DWARF_SUPPORT */ static inline struct annotated_data_type * -find_data_type(struct map_symbol *ms __maybe_unused, u64 ip __maybe_unused, - struct annotated_op_loc *loc __maybe_unused, - u64 addr __maybe_unused, const char *var_name __maybe_unused) +find_data_type(struct data_loc_info *dloc __maybe_unused) { return NULL; } diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index fe0074bb98f0..1cf55f903ee4 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -3744,9 +3744,7 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he) struct annotated_op_loc *op_loc; struct annotated_data_type *mem_type; struct annotated_item_stat *istat; - u64 ip = he->ip, addr = 0; - const char *var_name = NULL; - int var_offset; + u64 ip = he->ip; int i; ann_data_stat.total++; @@ -3794,51 +3792,53 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he) } for_each_insn_op_loc(&loc, i, op_loc) { + struct data_loc_info dloc = { + .ms = ms, + /* Recalculate IP for LOCK prefix or insn fusion */ + .ip = ms->sym->start + dl->al.offset, + .op = op_loc, + }; + if (!op_loc->mem_ref) continue; /* Recalculate IP because of LOCK prefix or insn fusion */ ip = ms->sym->start + dl->al.offset; - var_offset = op_loc->offset; - /* PC-relative addressing */ if (op_loc->reg1 == DWARF_REG_PC) { struct addr_location al; struct symbol *var; u64 map_addr; - addr = annotate_calc_pcrel(ms, ip, op_loc->offset, dl); + dloc.var_addr = annotate_calc_pcrel(ms, ip, op_loc->offset, dl); /* Kernel symbols might be relocated */ - map_addr = addr + map__reloc(ms->map); + map_addr = dloc.var_addr + map__reloc(ms->map); addr_location__init(&al); var = 
thread__find_symbol_fb(he->thread, he->cpumode, map_addr, &al); if (var) { - var_name = var->name; + dloc.var_name = var->name; /* Calculate type offset from the start of variable */ - var_offset = map_addr - map__unmap_ip(al.map, var->start); + dloc.type_offset = map_addr - map__unmap_ip(al.map, var->start); } addr_location__exit(&al); } - mem_type = find_data_type(ms, ip, op_loc, addr, var_name); + mem_type = find_data_type(&dloc); if (mem_type) istat->good++; else istat->bad++; - if (mem_type && var_name) - op_loc->offset = var_offset; - if (symbol_conf.annotate_data_sample) { annotated_data_type__update_samples(mem_type, evsel, - op_loc->offset, + dloc.type_offset, he->stat.nr_events, he->stat.period); } - he->mem_type_off = op_loc->offset; + he->mem_type_off = dloc.type_offset; return mem_type; } From patchwork Thu Oct 12 03:51:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418243 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3FCAD17E6; Thu, 12 Oct 2023 03:52:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="JTNPmJY1" Received: from mail-pl1-x636.google.com (mail-pl1-x636.google.com [IPv6:2607:f8b0:4864:20::636]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 419F7170F; Wed, 11 Oct 2023 20:52:04 -0700 (PDT) Received: by mail-pl1-x636.google.com with SMTP id d9443c01a7336-1c9b7c234a7so4824385ad.3; Wed, 11 Oct 2023 20:52:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082723; x=1697687523; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=luk3YMKHP4GBeY8OhFINcROss2mB/uL2Fi/9FHog88E=; b=JTNPmJY17mvmWeekp6EbtnHWVBVsqKijrmriYRPO7H93g6CP46HdEsotwkxTaOTgn4 8nOZ2FEitIOagqfteLV3GG3/th0zm2ZyPPobFJ79tW8vhnkcNkAevkcAkF7wqWd9lUdH fzC9Q1PwAek+3t5Av4A8sjUjkeDBOIBhzndgSMRR2wPL0NUnzgwiWZyDA4x/bejWo4qV 2U/q/CineA/MFqNjLmOR7IUbySKWjGw412JeLsQDlYV7UWtWopeyHFFrInELQDPgwtYV fvIFdHSbj4ZZ9ruNjvumN0R6z4/XZJ4L2eyrWB4rk+KdJ/zlJspUPKTny4pTb3w7F3Me Rwzw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082723; x=1697687523; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=luk3YMKHP4GBeY8OhFINcROss2mB/uL2Fi/9FHog88E=; b=voHCJ+lQZUmZsu/G5ijsbyZHBq3zbCKg2pBRcWtxN/BWeKLQxJFTdOjwJHZC1TcNOt do45x/kz8lJFZWeGFE6KcY9yEEpLZg5Md4J8zSdiUX+EYJkpCIMzG+vaOvk1AFbHoVMO ev54GT4thKlcMgF9mPLcmeDSdQoHUT62y3wXjMJkYVurwD0orzMpw+vKD02fVozM5ekE Hm09vHNbBI9Vv3B8Iebu//xoAFeB3Cw47ylu+9vqXuwgvxK4Orz5W947gFDBD1vZ83bp QZPRuqti8WRBMGJJyV79sM01US/eSOUX3Y6+d58eXkhJPO+rQse9HzJQNWjvRoGyonrN rGXA== X-Gm-Message-State: AOJu0YxGEeuVfUCacbpICNKu7ixvg4SMhOg7Qvd5BUassmo5MjPaWGZn KSZgOGT4VsSH6529BlHKXPfyy59nYO0= X-Google-Smtp-Source: AGHT+IEGtcf7HXRETg051KX40JXX4Za2TWn2bJI9mhVghyM9tjR/pC/6W+Oa23Iy4nk6K9fou4X08Q== X-Received: by 2002:a17:903:228f:b0:1c7:1fbc:b9e7 with SMTP id b15-20020a170903228f00b001c71fbcb9e7mr27350552plh.43.1697082723294; Wed, 11 Oct 2023 20:52:03 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.52.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:52:02 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML 
, linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 37/48] perf map: Add map__objdump_2rip() Date: Wed, 11 Oct 2023 20:51:00 -0700 Message-ID: <20231012035111.676789-38-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Sometimes we want to convert an address in objdump output to map-relative address to match with a sample data. Let's add map__objdump_2rip() for that. Cc: Adrian Hunter Signed-off-by: Namhyung Kim --- tools/perf/util/map.c | 20 ++++++++++++++++++++ tools/perf/util/map.h | 3 +++ 2 files changed, 23 insertions(+) diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c index f64b83004421..f25cf664c898 100644 --- a/tools/perf/util/map.c +++ b/tools/perf/util/map.c @@ -595,6 +595,26 @@ u64 map__objdump_2mem(struct map *map, u64 ip) return ip + map__reloc(map); } +u64 map__objdump_2rip(struct map *map, u64 ip) +{ + const struct dso *dso = map__dso(map); + + if (!dso->adjust_symbols) + return ip; + + if (dso->rel) + return ip + map__pgoff(map); + + /* + * kernel modules also have DSO_TYPE_USER in dso->kernel, + * but all kernel modules are ET_REL, so won't get here. 
+ */ + if (dso->kernel == DSO_SPACE__USER) + return ip - dso->text_offset; + + return map__map_ip(map, ip + map__reloc(map)); +} + bool map__contains_symbol(const struct map *map, const struct symbol *sym) { u64 ip = map__unmap_ip(map, sym->start); diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h index 1b53d53adc86..b7bcf0aa3b67 100644 --- a/tools/perf/util/map.h +++ b/tools/perf/util/map.h @@ -129,6 +129,9 @@ u64 map__rip_2objdump(struct map *map, u64 rip); /* objdump address -> memory address */ u64 map__objdump_2mem(struct map *map, u64 ip); +/* objdump address -> rip */ +u64 map__objdump_2rip(struct map *map, u64 ip); + struct symbol; struct thread; From patchwork Thu Oct 12 03:51:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418244 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 270B0186F; Thu, 12 Oct 2023 03:52:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="TvB1WPls" Received: from mail-pg1-x52c.google.com (mail-pg1-x52c.google.com [IPv6:2607:f8b0:4864:20::52c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4ACA3171A; Wed, 11 Oct 2023 20:52:05 -0700 (PDT) Received: by mail-pg1-x52c.google.com with SMTP id 41be03b00d2f7-58d261807e8so383513a12.2; Wed, 11 Oct 2023 20:52:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082724; x=1697687524; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=NoNrgrTYCFhmQWw/dMn/hk+rVy7/lvgQA+Rba8TK9Aw=; 
b=TvB1WPlswUHoMWa3lXsVY8CplYmUfqoTf9Mk8bYcE+NB2hicUFuELlTX68nYerrv20 8G8gXCigLZHo3+idPiipGEYjU8vRkWBNj7WHpymiJOE4t3JeR0c/GhYrFD8kXZQXX33w kUxn4RQj2qlKp5VaEGyCoCJQuH6YjIsiU3U1Ss5PKTW4d+pLY+yHzcQRqgB9W+PT/7Le LAUkymTLkiSlVLQZTKCDfZnmosIHo+DJYliJp87NAOgr/Z8Vc+mxfsSaBzh6elOr1oHd 2b+U9MPemW7/R9fgK50hY3Spa0BLJ4Nk9lBh3A0B/ttLrPbZ36fUmwbyuHXOfs4maZ3K jmYg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082724; x=1697687524; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=NoNrgrTYCFhmQWw/dMn/hk+rVy7/lvgQA+Rba8TK9Aw=; b=LIgIkCK4Ss918AmS0KG0Vu7j0FFw/cjuTQhty108nCV2s07odg1AE1CSGLtsHn5IPY IlUkMMea1XKBhlJJDrV9Irsc6rkiYH4ySdHy39OLkXQObEDXzL297CHxIKBPk+hQ6AGo Cn1HAKvv0jbyhs2AJZGyuSh8ot9Y17D9O3E3tEmyt3y2CR1VqvBaTC0rY7qsBuXj999c nzPFxtgpdMUnHaKjpo8Y+Ba5cJWFW6MSDwhjDZqLsDKBvnX118DNeIHawI/DTyo0pqCo xEEX10E95CWEFAk/vakHTlzI7WGT3JqekdHFn9mu9wDnd8eGroCebD/vP20oS3G71oQZ B7Cg== X-Gm-Message-State: AOJu0YzPgY1MpZXgO2wgFd+VpC01oUf7WcYWwzUykcMwku8/fnrnbA+P z0V+gnP6i0oVd3bRiYH/Qok= X-Google-Smtp-Source: AGHT+IHp+5v8TvQ4w+OkmVq7Q10RkNSsylihPWkJ6Iz21p7k40Ucn4vL3rQcv9YJjmHjq+QJ3tBu+g== X-Received: by 2002:a17:903:184:b0:1c8:8f61:967b with SMTP id z4-20020a170903018400b001c88f61967bmr15783049plg.3.1697082724523; Wed, 11 Oct 2023 20:52:04 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.52.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:52:04 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, 
linux-trace-devel@vger.kernel.org Subject: [PATCH 38/48] perf annotate: Add annotate_get_basic_blocks() Date: Wed, 11 Oct 2023 20:51:01 -0700 Message-ID: <20231012035111.676789-39-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net The annotate_get_basic_blocks() is to find a list of basic blocks from the source instruction to the destination instruction in a function. It'll be used to find variables in a scope. Use BFS (Breadth First Search) to find a shortest path to carry the variable/register state minimally. Also change find_disasm_line() to be used in annotate_get_basic_blocks() and add 'allow_update' argument to control if it can update the IP. 
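For illustration only, and not part of this patch: the BFS-with-parent-links idea described above can be sketched on a toy control-flow graph. Everything in the sketch (struct cfg, MAX_BLOCKS, shortest_block_path()) is a hypothetical name made up for this example. The actual code works on disasm_line lists and struct basic_block_link rather than on an index array, so treat this as a rough model under those assumptions.

```c
#include <assert.h>

#define MAX_BLOCKS 16

/*
 * Hypothetical toy CFG (not a perf data structure): every block has up to
 * two successors, the jump target and the fall-through block.  A value of
 * -1 means "no successor".
 */
struct cfg {
	int nr_blocks;
	int succ[MAX_BLOCKS][2];
};

/*
 * Breadth-first search from block @src to block @dst.  Each block records
 * the parent it was first reached from, so the shortest path can be rebuilt
 * by walking the parent links back from @dst.
 *
 * On success, path[] holds the block indices from @src to @dst and the
 * number of blocks on the path is returned.  Returns -1 if @dst cannot be
 * reached or the arguments are out of range.
 */
int shortest_block_path(const struct cfg *cfg, int src, int dst, int *path)
{
	int parent[MAX_BLOCKS];
	int visited[MAX_BLOCKS] = { 0 };
	int queue[MAX_BLOCKS];
	int head = 0, tail = 0;
	int len = 0, cur, i;

	assert(cfg->nr_blocks <= MAX_BLOCKS);

	if (src < 0 || src >= cfg->nr_blocks ||
	    dst < 0 || dst >= cfg->nr_blocks)
		return -1;

	parent[src] = -1;
	visited[src] = 1;
	queue[tail++] = src;

	while (head < tail) {
		cur = queue[head++];
		if (cur == dst)
			break;

		for (i = 0; i < 2; i++) {
			int next = cfg->succ[cur][i];

			/* Skip missing successors and already seen blocks */
			if (next < 0 || next >= cfg->nr_blocks || visited[next])
				continue;

			visited[next] = 1;
			parent[next] = cur;
			queue[tail++] = next;
		}
	}

	if (!visited[dst])
		return -1;

	/* Count the blocks on the path, then fill path[] from the end */
	for (cur = dst; cur != -1; cur = parent[cur])
		len++;

	i = len;
	for (cur = dst; cur != -1; cur = parent[cur])
		path[--i] = cur;

	return len;
}
```

With blocks 0 -> {1, 2}, 1 -> 3, 2 -> 4 and 3 -> 4, a search from 0 to 4 yields the path 0, 2, 4 rather than the longer 0, 1, 3, 4. That mirrors why the patch keeps a parent pointer per link: only the blocks on the shortest path need to carry variable/register state.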
Signed-off-by: Namhyung Kim --- tools/perf/util/annotate.c | 219 ++++++++++++++++++++++++++++++++++++- tools/perf/util/annotate.h | 16 +++ 2 files changed, 232 insertions(+), 3 deletions(-) diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index 1cf55f903ee4..8384bc37831c 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -3642,7 +3642,8 @@ static void symbol__ensure_annotate(struct map_symbol *ms, struct evsel *evsel) } } -static struct disasm_line *find_disasm_line(struct symbol *sym, u64 ip) +static struct disasm_line *find_disasm_line(struct symbol *sym, u64 ip, + bool allow_update) { struct disasm_line *dl; struct annotation *notes; @@ -3655,7 +3656,8 @@ static struct disasm_line *find_disasm_line(struct symbol *sym, u64 ip) * llvm-objdump places "lock" in a separate line and * in that case, we want to get the next line. */ - if (!strcmp(dl->ins.name, "lock") && *dl->ops.raw == '\0') { + if (!strcmp(dl->ins.name, "lock") && + *dl->ops.raw == '\0' && allow_update) { ip++; continue; } @@ -3766,7 +3768,7 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he) * Get a disasm to extract the location from the insn. * This is too slow... */ - dl = find_disasm_line(ms->sym, ip); + dl = find_disasm_line(ms->sym, ip, /*allow_update=*/true); if (dl == NULL) { ann_data_stat.no_insn++; return NULL; @@ -3860,3 +3862,214 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he) istat->bad++; return NULL; } + +/* Basic block traversal (BFS) data structure */ +struct basic_block_data { + struct list_head queue; + struct list_head visited; +}; + +/* + * During the traversal, it needs to know the parent block where the current + * block block started from. Note that single basic block can be parent of + * two child basic blocks (in case of condition jump). 
+ */ +struct basic_block_link { + struct list_head node; + struct basic_block_link *parent; + struct annotated_basic_block *bb; +}; + +/* Check any of basic block in the list already has the offset */ +static bool basic_block_has_offset(struct list_head *head, s64 offset) +{ + struct basic_block_link *link; + + list_for_each_entry(link, head, node) { + s64 begin_offset = link->bb->begin->al.offset; + s64 end_offset = link->bb->end->al.offset; + + if (begin_offset <= offset && offset <= end_offset) + return true; + } + return false; +} + +static bool is_new_basic_block(struct basic_block_data *bb_data, + struct disasm_line *dl) +{ + s64 offset = dl->al.offset; + + if (basic_block_has_offset(&bb_data->visited, offset)) + return false; + if (basic_block_has_offset(&bb_data->queue, offset)) + return false; + return true; +} + +/* Add a basic block starting from dl and link it to the parent */ +static int add_basic_block(struct basic_block_data *bb_data, + struct basic_block_link *parent, + struct disasm_line *dl) +{ + struct annotated_basic_block *bb; + struct basic_block_link *link; + + if (dl == NULL) + return -1; + + if (!is_new_basic_block(bb_data, dl)) + return 0; + + bb = zalloc(sizeof(*bb)); + if (bb == NULL) + return -1; + + bb->begin = dl; + bb->end = dl; + INIT_LIST_HEAD(&bb->list); + + link = malloc(sizeof(*link)); + if (link == NULL) { + free(bb); + return -1; + } + + link->bb = bb; + link->parent = parent; + list_add_tail(&link->node, &bb_data->queue); + return 0; +} + +/* Returns true when it finds the target in the current basic block */ +static bool process_basic_block(struct basic_block_data *bb_data, + struct basic_block_link *link, + struct symbol *sym, u64 target) +{ + struct disasm_line *dl, *next_dl, *last_dl; + struct annotation *notes = symbol__annotation(sym); + bool found = false; + + dl = link->bb->begin; + /* Check if it's already visited */ + if (basic_block_has_offset(&bb_data->visited, dl->al.offset)) + return false; + + last_dl = 
list_last_entry(&notes->src->source, + struct disasm_line, al.node); + + list_for_each_entry_from(dl, &notes->src->source, al.node) { + /* Found the target instruction */ + if (sym->start + dl->al.offset == target) { + found = true; + break; + } + /* End of the function, finish the block */ + if (dl == last_dl) + break; + /* 'return' instruction finishes the block */ + if (dl->ins.ops == &ret_ops) + break; + /* normal instructions are part of the basic block */ + if (dl->ins.ops != &jump_ops) + continue; + /* jump to a different function, tail call or return */ + if (dl->ops.target.outside) + break; + /* jump instruction creates new basic block(s) */ + next_dl = find_disasm_line(sym, sym->start + dl->ops.target.offset, + /*allow_update=*/false); + add_basic_block(bb_data, link, next_dl); + + /* + * FIXME: determine conditional jumps properly. + * Conditional jumps create another basic block with the + * next disasm line. + */ + if (!strstr(dl->ins.name, "jmp")) { + next_dl = list_next_entry(dl, al.node); + add_basic_block(bb_data, link, next_dl); + } + break; + + } + link->bb->end = dl; + return found; +} + +/* + * Once it finds a target basic block, build a proper linked list of basic blocks + * by following the link recursively.
+ */ +static void link_found_basic_blocks(struct basic_block_link *link, + struct list_head *head) +{ + while (link) { + struct basic_block_link *parent = link->parent; + + list_move(&link->bb->list, head); + list_del(&link->node); + free(link); + + link = parent; + } +} + +static void delete_basic_blocks(struct basic_block_data *bb_data) +{ + struct basic_block_link *link, *tmp; + + list_for_each_entry_safe(link, tmp, &bb_data->queue, node) { + list_del(&link->node); + free(link->bb); + free(link); + } + + list_for_each_entry_safe(link, tmp, &bb_data->visited, node) { + list_del(&link->node); + free(link->bb); + free(link); + } +} + +/** + * annotate_get_basic_blocks - Get basic blocks for given address range + * @sym: symbol to annotate + * @src: source address + * @dst: destination address + * @head: list head to save basic blocks + * + * This function traverses disasm_lines from @src to @dst and save them in a + * list of annotated_basic_block to @head. It uses BFS to find the shortest + * path between two. The basic_block_link is to maintain parent links so + * that it can build a list of blocks from the start. 
+ */ +int annotate_get_basic_blocks(struct symbol *sym, s64 src, s64 dst, + struct list_head *head) +{ + struct basic_block_data bb_data = { + .queue = LIST_HEAD_INIT(bb_data.queue), + .visited = LIST_HEAD_INIT(bb_data.visited), + }; + struct basic_block_link *link; + struct disasm_line *dl; + int ret = -1; + + dl = find_disasm_line(sym, src, /*allow_update=*/false); + if (add_basic_block(&bb_data, /*parent=*/NULL, dl) < 0) + return -1; + + /* Find shortest path from src to dst using BFS */ + while (!list_empty(&bb_data.queue)) { + link = list_first_entry(&bb_data.queue, struct basic_block_link, node); + + if (process_basic_block(&bb_data, link, sym, dst)) { + link_found_basic_blocks(link, head); + ret = 0; + break; + } + list_move(&link->node, &bb_data.visited); + } + delete_basic_blocks(&bb_data); + return ret; +} diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h index 99c8d30a2fa7..c2cc9baf08be 100644 --- a/tools/perf/util/annotate.h +++ b/tools/perf/util/annotate.h @@ -493,4 +493,20 @@ extern struct list_head ann_insn_stat; u64 annotate_calc_pcrel(struct map_symbol *ms, u64 ip, int offset, struct disasm_line *dl); +/** + * struct annotated_basic_block - Basic block of instructions + * @list: List node + * @begin: start instruction in the block + * @end: end instruction in the block + */ +struct annotated_basic_block { + struct list_head list; + struct disasm_line *begin; + struct disasm_line *end; +}; + +/* Get a list of basic blocks from src to dst addresses */ +int annotate_get_basic_blocks(struct symbol *sym, s64 src, s64 dst, + struct list_head *head); + #endif /* __PERF_ANNOTATE_H */ From patchwork Thu Oct 12 03:51:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418246 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client 
certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9F4911378; Thu, 12 Oct 2023 03:52:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="jT4qk5U+" Received: from mail-pl1-x634.google.com (mail-pl1-x634.google.com [IPv6:2607:f8b0:4864:20::634]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 205E41722; Wed, 11 Oct 2023 20:52:06 -0700 (PDT) Received: by mail-pl1-x634.google.com with SMTP id d9443c01a7336-1c735473d1aso4399105ad.1; Wed, 11 Oct 2023 20:52:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082726; x=1697687526; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=mLaKLGWSqUFuR6JldirEmidqSxeEteg7NHTV+cCNTLU=; b=jT4qk5U+udJI7tGJANiXTTxyV6zNByGvZ4ixj4F7DJOHfix8xRJtU2Yj0XGFbfTXOi OXo6sAMCgDi6i8pSmjmXif5dgA0ahVbrheG10wBChUF28ipOPknda5usB+1lqgONc719 sxHYF93P4vtCxnMjj7RfW0ES9N6J0ZJj4XGOmDhrrSUKZrtMVWzARaJ/NRWzEuTGGEzZ wpCp494QvD/4xORlt5KpQmY/f6yXF8WGgkhy6hFobG0wWOvZatmwm6bM1uvrZGziUQ26 hae664zpJlKRZb3pGiZqxJi4sUyT4GHHUVzYehm5doR70XJrh5r8QchtPx1xcgYwN2Kz dQIg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082726; x=1697687526; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=mLaKLGWSqUFuR6JldirEmidqSxeEteg7NHTV+cCNTLU=; b=neu83P1LXaeMKzwC9cR8rCKSv2F8xRmxWvbZmDiDauQaGCLN6SD3KrDV+NtO5xCRtf aUHIhUpl8nzZyiyHC1Hmn/MKChE6f+Hpm0LuF3K7lB9waoAlH0Yjlve4kz0oXhqk2swN OycMSE366sVSevFG/ceDCKh9O60r1dlIgvLpCJjYNQGEw5n8ALD9hamRSw4yZzAatr0A LXbRpC+Hmcf0jvoFHxbHwmN/BQjd3mQfCioZaSsg+MKxXD2IClHogr1S4ERbPD3aiRXl NsFnKGAD+PIAQB1j7xfs5wG6zAl2mAhEFKB9r8O9lv7iF+P6JpR3PFbT3773U9q3pj7o nmgw== X-Gm-Message-State: 
AOJu0YyQgHGUTr7bErPQBygb8/loWPKXSXF+mwx9lwazsjOecblnab0N YjTswXi8OO5iim16kOJZuBM= X-Google-Smtp-Source: AGHT+IEBHcceyfL5BN3w/HlZUuDeOYJc3lMOzNXoXFfYjdvl7FwJAxPoxquIRRRn/JdbKrn6X8LlUQ== X-Received: by 2002:a17:902:ec85:b0:1c4:749e:e725 with SMTP id x5-20020a170902ec8500b001c4749ee725mr23136550plg.0.1697082725805; Wed, 11 Oct 2023 20:52:05 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.52.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:52:05 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 39/48] perf annotate-data: Maintain variable type info Date: Wed, 11 Oct 2023 20:51:02 -0700 Message-ID: <20231012035111.676789-40-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net As it collected basic block and variable information in each scope, it now can build a state table to find matching variable at the location. The struct type_state is to keep the type info saved in each register and stack slot. 
The update_var_state() updates the table when it finds variables in the current address. It expects die_collect_vars() filled a list of variables with type info and starting address. Signed-off-by: Namhyung Kim --- tools/perf/util/annotate-data.c | 155 ++++++++++++++++++++++++++++++++ tools/perf/util/annotate-data.h | 29 ++++++ tools/perf/util/dwarf-aux.c | 4 + 3 files changed, 188 insertions(+) diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c index 90793cbb6aa0..a88d2cdafa08 100644 --- a/tools/perf/util/annotate-data.c +++ b/tools/perf/util/annotate-data.c @@ -40,6 +40,57 @@ struct annotated_data_type stackop_type = { /* Data type collection debug statistics */ struct annotated_data_stat ann_data_stat; +/* Type information in a register, valid when ok is true */ +struct type_state_reg { + Dwarf_Die type; + bool ok; + bool scratch; +}; + +/* Type information in a stack location, dynamically allocated */ +struct type_state_stack { + struct list_head list; + Dwarf_Die type; + int offset; + int size; + bool compound; +}; + +/* FIXME: This should be arch-dependent */ +#define TYPE_STATE_MAX_REGS 16 + +/* + * State table to maintain type info in each register and stack location. + * It'll be updated when new variable is allocated or type info is moved + * to a new location (register or stack). As it'd be used with the + * shortest path of basic blocks, it only maintains a single table. 
+ */ +struct type_state { + struct type_state_reg regs[TYPE_STATE_MAX_REGS]; + struct list_head stack_vars; +}; + +static bool has_reg_type(struct type_state *state, int reg) +{ + return (unsigned)reg < ARRAY_SIZE(state->regs); +} + +void init_type_state(struct type_state *state, struct arch *arch __maybe_unused) +{ + memset(state, 0, sizeof(*state)); + INIT_LIST_HEAD(&state->stack_vars); +} + +void exit_type_state(struct type_state *state) +{ + struct type_state_stack *stack, *tmp; + + list_for_each_entry_safe(stack, tmp, &state->stack_vars, list) { + list_del(&stack->list); + free(stack); + } +} + /* * Compare type name and size to maintain them in a tree. * I'm not sure if DWARF would have information of a single type in many @@ -255,6 +306,110 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset, return 0; } +static struct type_state_stack *find_stack_state(struct type_state *state, + int offset) +{ + struct type_state_stack *stack; + + list_for_each_entry(stack, &state->stack_vars, list) { + if (offset == stack->offset) + return stack; + + if (stack->compound && stack->offset < offset && + offset < stack->offset + stack->size) + return stack; + } + return NULL; +} + +static void set_stack_state(struct type_state_stack *stack, int offset, + Dwarf_Die *type_die) +{ + int tag; + Dwarf_Word size; + + if (dwarf_aggregate_size(type_die, &size) < 0) + size = 0; + + tag = dwarf_tag(type_die); + + stack->type = *type_die; + stack->size = size; + stack->offset = offset; + + switch (tag) { + case DW_TAG_structure_type: + case DW_TAG_union_type: + stack->compound = true; + break; + default: + stack->compound = false; + break; + } +} + +static struct type_state_stack *findnew_stack_state(struct type_state *state, + int offset, Dwarf_Die *type_die) +{ + struct type_state_stack *stack = find_stack_state(state, offset); + + if (stack) { + set_stack_state(stack, offset, type_die); + return stack; + } + + stack = malloc(sizeof(*stack)); + if (stack) 
{ + set_stack_state(stack, offset, type_die); + list_add(&stack->list, &state->stack_vars); + } + return stack; +} + +/** + * update_var_state - Update type state using given variables + * @state: type state table + * @dloc: data location info + * @addr: instruction address to update + * @var_types: list of variables with type info + * + * This function fills the @state table using @var_types info. Each variable + * is used only at the given location and updates an entry in the table. + */ +void update_var_state(struct type_state *state, struct data_loc_info *dloc, + u64 addr, struct die_var_type *var_types) +{ + Dwarf_Die mem_die; + struct die_var_type *var; + int fbreg = dloc->fbreg; + int fb_offset = 0; + + if (dloc->fb_cfa) { + if (die_get_cfa(dloc->di->dbg, addr, &fbreg, &fb_offset) < 0) + fbreg = -1; + } + + for (var = var_types; var != NULL; var = var->next) { + if (var->addr != addr) + continue; + /* Get the type DIE using the offset */ + if (!dwarf_offdie(dloc->di->dbg, var->die_off, &mem_die)) + continue; + + if (var->reg == DWARF_REG_FB) { + findnew_stack_state(state, var->offset, &mem_die); + } else if (var->reg == fbreg) { + findnew_stack_state(state, var->offset - fb_offset, &mem_die); + } else if (has_reg_type(state, var->reg)) { + struct type_state_reg *reg; + + reg = &state->regs[var->reg]; + reg->type = mem_die; + reg->ok = true; + } + } +} + /* The result will be saved in @type_die */ static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die) { diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h index ad6493ea2c8e..7fbb9eb2e96f 100644 --- a/tools/perf/util/annotate-data.h +++ b/tools/perf/util/annotate-data.h @@ -8,9 +8,12 @@ #include struct annotated_op_loc; +struct arch; struct debuginfo; +struct die_var_type; struct evsel; struct map_symbol; +struct type_state; /** * struct annotated_member - Type of member field @@ -146,6 +149,16 @@ int annotated_data_type__update_samples(struct 
annotated_data_type *adt, /* Release all data type information in the tree */ void annotated_data_type__tree_delete(struct rb_root *root); +/* Initialize type state table */ +void init_type_state(struct type_state *state, struct arch *arch); + +/* Destroy type state table */ +void exit_type_state(struct type_state *state); + +/* Update type state table using variables */ +void update_var_state(struct type_state *state, struct data_loc_info *dloc, + u64 addr, struct die_var_type *var_types); + #else /* HAVE_DWARF_SUPPORT */ static inline struct annotated_data_type * @@ -168,6 +181,22 @@ static inline void annotated_data_type__tree_delete(struct rb_root *root __maybe { } +static inline void init_type_state(struct type_state *state __maybe_unused, + struct arch *arch __maybe_unused) +{ +} + +static inline void exit_type_state(struct type_state *state __maybe_unused) +{ +} + +static inline void update_var_state(struct type_state *state __maybe_unused, + struct data_loc_info *dloc __maybe_unused, + u64 addr __maybe_unused, + struct die_var_type *var_types __maybe_unused) +{ +} + #endif /* HAVE_DWARF_SUPPORT */ #endif /* _PERF_ANNOTATE_DATA_H */ diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c index 5ec895e0a069..923e974ad18e 100644 --- a/tools/perf/util/dwarf-aux.c +++ b/tools/perf/util/dwarf-aux.c @@ -9,6 +9,7 @@ #include #include "debug.h" #include "dwarf-aux.h" +#include "dwarf-regs.h" #include "strbuf.h" #include "string2.h" @@ -1490,6 +1491,8 @@ static int reg_from_dwarf_op(Dwarf_Op *op) case DW_OP_regx: case DW_OP_bregx: return op->number; + case DW_OP_fbreg: + return DWARF_REG_FB; default: break; } @@ -1503,6 +1506,7 @@ static int offset_from_dwarf_op(Dwarf_Op *op) case DW_OP_regx: return 0; case DW_OP_breg0 ... 
DW_OP_breg31: + case DW_OP_fbreg: return op->number; case DW_OP_bregx: return op->number2; From patchwork Thu Oct 12 03:51:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418248 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E1C7E15D5; Thu, 12 Oct 2023 03:52:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="MoKHCygK" Received: from mail-pl1-x631.google.com (mail-pl1-x631.google.com [IPv6:2607:f8b0:4864:20::631]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A9E8F172E; Wed, 11 Oct 2023 20:52:08 -0700 (PDT) Received: by mail-pl1-x631.google.com with SMTP id d9443c01a7336-1c60cec8041so3991835ad.3; Wed, 11 Oct 2023 20:52:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082727; x=1697687527; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=1uOezb4cZDV5RInjOhQ1kglHXO9ceCmqMPAaugA2Fws=; b=MoKHCygKSwaGJUUK1FQxbHAg87Ol4tlOMFRUL8LkUZmF+HSrgXfmmFrCMOhn2U/SbH 5OCFZ13r06Bo3us26Af51KOt0s5XmR3IsFyjlIhYgbs59/VEQYm4VHpTtfp8HVikBc9/ F1vv+GIlLomWekQq9vKE4DEeHG9bsenWYtkcmz09D7vC85lO1rXnAmL2nmH6PAccraBB C3SSa2r3UXycNWN4L62u0dhu9u6VRGdZxrAByqHl+jiLYMjIWxQUAoKo2QYRe7Pnw0By xwJRmmsVmFInIcnWiXDNh2qiGBG1rbkQDrsjUeq3GafUv6K6lZRFE7C/cVAI9QkKUnVb o8Fg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082727; x=1697687527; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from 
:to:cc:subject:date:message-id:reply-to; bh=1uOezb4cZDV5RInjOhQ1kglHXO9ceCmqMPAaugA2Fws=; b=HDwlDnSo3/DCF9wyNEk/aRetaOwk/efTwBF4bXu3LuW8dMi9wRb/iTx8T98PEk6j0z O1dYTDDwsGr8AtGTLVe1QvCJkCOmCmNOtpHCd/hd0wIVBz80oJpmRfGzrl7uKxn5FxwL kiJoPk1q6GfVtZIuMIY8pdXCV9w9B0sNbnfng6tfJLPKyX9iyOBtZFpRiMX94/UP3l4+ xPiOYUd/a94Wk4GqITzVyqvs3BZtao03Nc372lgL4Tvf9we/fci2rgmuBfUOowhRegPd Il1RARNw7nHGnFh/xcJOiumnsiEfpcrHCArgec1C0rC3xvNYAUwRepJs6na8sCttN4Op H+LQ== X-Gm-Message-State: AOJu0Yz94cOsdrq6cJQfmtTpR0/bP2Nv+MLpfSAnjYdmNL5y0jDcBnJH 7weDhqx+fANu0Mv9O4vphe4= X-Google-Smtp-Source: AGHT+IEgZY3TnNf8iEy/TGlCaUg5WcrxoTFCjxq9iR/g1TXxY/TqIFy+lgJSqTNVoKUwaLrMv9c+Zw== X-Received: by 2002:a17:902:ab82:b0:1c1:e7b2:27ad with SMTP id f2-20020a170902ab8200b001c1e7b227admr19761542plr.60.1697082727072; Wed, 11 Oct 2023 20:52:07 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.52.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:52:06 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 40/48] perf annotate-data: Add update_insn_state() Date: Wed, 11 Oct 2023 20:51:03 -0700 Message-ID: <20231012035111.676789-41-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, 
DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net The update_insn_state() function updates the type state table after processing each instruction. For now, it only handles MOV instructions (on x86), which transfer type info from the source location to the target. The location can be a register or a stack slot. Check carefully when a memory reference happens and fetch the type correctly. It basically ignores writes to memory since they don't change the type info. One exception is writes to (new) stack slots for register spilling. Signed-off-by: Namhyung Kim --- tools/perf/util/annotate-data.c | 128 +++++++++++++++++++++++++++++++- tools/perf/util/annotate-data.h | 13 ++++ tools/perf/util/annotate.c | 1 + 3 files changed, 140 insertions(+), 2 deletions(-) diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c index a88d2cdafa08..e8d80b1adda9 100644 --- a/tools/perf/util/annotate-data.c +++ b/tools/perf/util/annotate-data.c @@ -44,7 +44,6 @@ struct annotated_data_stat ann_data_stat; struct type_state_reg { Dwarf_Die type; bool ok; - bool scratch; }; /* Type information in a stack location, dynamically allocated */ @@ -400,7 +399,7 @@ void update_var_state(struct type_state *state, struct data_loc_info *dloc, findnew_stack_state(state, var->offset, &mem_die); } else if (var->reg == fbreg) { findnew_stack_state(state, var->offset - fb_offset, &mem_die); - } else if (has_reg_type(state, var->reg)) { + } else if (has_reg_type(state, var->reg) && var->offset == 0) { struct type_state_reg *reg; reg = &state->regs[var->reg]; @@ -410,6 +409,131 @@ void update_var_state(struct type_state *state, struct data_loc_info *dloc, } } +/** + * update_insn_state - Update type state for an instruction + * @state: type state table + * @dloc: data location info
+ * @dl: disasm line for the instruction + * + * This function updates the @state table for the target operand of the + * instruction at @dl if it transfers the type like MOV on x86. Since it + * tracks the type, it won't care about the values like in arithmetic + * instructions like ADD/SUB/MUL/DIV and INC/DEC. + * + * Note that ops->reg2 is only available when both mem_ref and multi_regs + * are true. + */ +void update_insn_state(struct type_state *state, struct data_loc_info *dloc, + struct disasm_line *dl) +{ + struct annotated_insn_loc loc; + struct annotated_op_loc *src = &loc.ops[INSN_OP_SOURCE]; + struct annotated_op_loc *dst = &loc.ops[INSN_OP_TARGET]; + Dwarf_Die type_die; + int fbreg = dloc->fbreg; + int fboff = 0; + + /* FIXME: remove x86 specific code and handle more instructions like LEA */ + if (!strstr(dl->ins.name, "mov")) + return; + + if (annotate_get_insn_location(dloc->arch, dl, &loc) < 0) + return; + + if (dloc->fb_cfa) { + u64 ip = dloc->ms->sym->start + dl->al.offset; + u64 pc = map__rip_2objdump(dloc->ms->map, ip); + + if (die_get_cfa(dloc->di->dbg, pc, &fbreg, &fboff) < 0) + fbreg = -1; + } + + /* Case 1. register to register transfers */ + if (!src->mem_ref && !dst->mem_ref) { + if (!has_reg_type(state, dst->reg1)) + return; + + if (has_reg_type(state, src->reg1)) + state->regs[dst->reg1] = state->regs[src->reg1]; + else + state->regs[dst->reg1].ok = false; + } + /* Case 2. 
memory to register transfers */ + if (src->mem_ref && !dst->mem_ref) { + int sreg = src->reg1; + + if (!has_reg_type(state, dst->reg1)) + return; + +retry: + /* Check stack variables with offset */ + if (sreg == fbreg) { + struct type_state_stack *stack; + int offset = src->offset - fboff; + + stack = find_stack_state(state, offset); + if (stack && die_get_member_type(&stack->type, + offset - stack->offset, + &type_die)) { + state->regs[dst->reg1].type = type_die; + state->regs[dst->reg1].ok = true; + } else + state->regs[dst->reg1].ok = false; + } + /* And then dereference the pointer if it has one */ + else if (has_reg_type(state, sreg) && state->regs[sreg].ok && + die_deref_ptr_type(&state->regs[sreg].type, + src->offset, &type_die)) { + state->regs[dst->reg1].type = type_die; + state->regs[dst->reg1].ok = true; + } + /* Or try another register if any */ + else if (src->multi_regs && sreg == src->reg1 && + src->reg1 != src->reg2) { + sreg = src->reg2; + goto retry; + } + /* It failed to get a type info, mark it as invalid */ + else { + state->regs[dst->reg1].ok = false; + } + } + /* Case 3. register to memory transfers */ + if (!src->mem_ref && dst->mem_ref) { + if (!has_reg_type(state, src->reg1) || + !state->regs[src->reg1].ok) + return; + + /* Check stack variables with offset */ + if (dst->reg1 == fbreg) { + struct type_state_stack *stack; + int offset = dst->offset - fboff; + + stack = find_stack_state(state, offset); + if (stack) { + /* + * The source register is likely to hold a type + * of member if it's a compound type. Do not + * update the stack variable type since we can + * get the member type later by using the + * die_get_member_type(). + */ + if (!stack->compound) + set_stack_state(stack, offset, + &state->regs[src->reg1].type); + } else { + findnew_stack_state(state, offset, + &state->regs[src->reg1].type); + } + } + /* + * Ignore other transfers since it'd set a value in a struct + * and won't change the type. + */ + } + /* Case 4.
memory to memory transfers (not handled for now) */ +} + /* The result will be saved in @type_die */ static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die) { diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h index 7fbb9eb2e96f..ff9acf6ea808 100644 --- a/tools/perf/util/annotate-data.h +++ b/tools/perf/util/annotate-data.h @@ -11,6 +11,7 @@ struct annotated_op_loc; struct arch; struct debuginfo; struct die_var_type; +struct disasm_line; struct evsel; struct map_symbol; struct type_state; @@ -78,6 +79,7 @@ extern struct annotated_data_type stackop_type; /** * struct data_loc_info - Data location information + * @arch: architecture info * @ms: Map and Symbol info * @ip: Instruction address * @var_addr: Data address (for global variables) @@ -90,6 +92,7 @@ extern struct annotated_data_type stackop_type; */ struct data_loc_info { /* These are input field, should be filled by caller */ + struct arch *arch; struct map_symbol *ms; u64 ip; u64 var_addr; @@ -159,6 +162,10 @@ void exit_type_state(struct type_state *state); void update_var_state(struct type_state *state, struct data_loc_info *dloc, u64 addr, struct die_var_type *var_types); +/* Update type state table for an instruction */ +void update_insn_state(struct type_state *state, struct data_loc_info *dloc, + struct disasm_line *dl); + #else /* HAVE_DWARF_SUPPORT */ static inline struct annotated_data_type * @@ -197,6 +204,12 @@ static inline void update_var_state(struct type_state *state __maybe_unused, { } +static inline void update_insn_state(struct type_state *state __maybe_unused, + struct data_loc_info *dloc __maybe_unused, + struct disasm_line *dl __maybe_unused) +{ +} + #endif /* HAVE_DWARF_SUPPORT */ #endif /* _PERF_ANNOTATE_DATA_H */ diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index 8384bc37831c..ab4b6a1d86fe 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -3795,6 +3795,7 @@ struct annotated_data_type 
*hist_entry__get_data_type(struct hist_entry *he) for_each_insn_op_loc(&loc, i, op_loc) { struct data_loc_info dloc = { + .arch = arch, .ms = ms, /* Recalculate IP for LOCK prefix or insn fusion */ .ip = ms->sym->start + dl->al.offset, From patchwork Thu Oct 12 03:51:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418256 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3AD095251; Thu, 12 Oct 2023 03:52:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="dGuVDFkw" Received: from mail-pl1-x62a.google.com (mail-pl1-x62a.google.com [IPv6:2607:f8b0:4864:20::62a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8A26C1739; Wed, 11 Oct 2023 20:52:09 -0700 (PDT) Received: by mail-pl1-x62a.google.com with SMTP id d9443c01a7336-1c9e06f058bso1579295ad.0; Wed, 11 Oct 2023 20:52:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082728; x=1697687528; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=c+cQ9CSkhhrD4qbqChHjPumcDdy/6FXJfEC4K6SPW0Y=; b=dGuVDFkw9QZFISzPwqWnHTd5H+JTKV+73FvQMMNZo6nuWJlIA+d5vKt43lp6zQqgdS w/wZOxQm2O2U+AJuF2pwuydFnV3uhzWIpCfunz7X1jChX7pniZHImqFV3Dwpsk6wF1HN ZG+TQo0yKRNBbVrhSOl4NRF5fi16Gjymz8b+lnUrM5FZ3tlkPzT4a++QQLvCy2IkWAnP OQD2cugAgXaYN11kv7bpCfyxPfLxFqXJ4WLFYIIqEl+/FKGoE1vag6ZqVl7fWmCyCxMf cSSBCwBbrKP4TOTlwrW+wrhx1flUoUVudCgUm8B0ARK2qZPFGGL6n1NReOu77i7v3Ra+ oGIA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082728; x=1697687528; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=c+cQ9CSkhhrD4qbqChHjPumcDdy/6FXJfEC4K6SPW0Y=; b=gsCkVpz9fd3X6lt4faJ9xymy6aZnhUKgg1Hil2kigNsdywP5dBdZPZKRQrrFI71u0I F2wSFKEsI6f7P9yoWolyncW9kJkQxHpdGwn0W+8ovaKckB+a0aqCmifYLFTS8eVlkg4L MBa/W+S3mnXNKcU8BDi5L4CvDxzEXBnp8CIT62eYTsOTEJCFBdGXzuRTJE4NYyUqyA20 C+2xVmdhLTCgomYuaveLnnVwOdYAzJQdbDy2AokLmmD/VtkFby85B+vZOTfrnaX4cwAr PCUlU0TzovelCBoRH5rUY2K0L5tWUhtpN8ZT3d8bIXwtJGlNKa1uL5nAyjFf7YMHgqts FWDA== X-Gm-Message-State: AOJu0YyU9WLKXt7wxhzMxvLrJou7bjtkhOC9bDui4+s+nPOj40gTQ1Oo lV0GW4ZfbWaVISIx553j8js= X-Google-Smtp-Source: AGHT+IGYY2REijxPTjznI/mr2/Fqobn49pWYTc0Hv7sPSWV59ECYgQRZBNuwQvYFQGVXK1RZC6vNOA== X-Received: by 2002:a17:903:110f:b0:1c9:d358:b3d9 with SMTP id n15-20020a170903110f00b001c9d358b3d9mr4393201plh.18.1697082728435; Wed, 11 Oct 2023 20:52:08 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.52.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:52:08 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 41/48] perf annotate-data: Handle global variable access Date: Wed, 11 Oct 2023 20:51:04 -0700 Message-ID: <20231012035111.676789-42-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: 
MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net When updating the instruction states, it also needs to handle global variable accesses. Same as it does for PC-relative addressing, it can look up the type by address (if it's defined in the same file), or by name after finding the symbol by address (for declarations). Signed-off-by: Namhyung Kim --- tools/perf/util/annotate-data.c | 45 ++++++++++++++++++++++++++++++--- tools/perf/util/annotate-data.h | 10 ++++++-- tools/perf/util/annotate.c | 45 ++++++++++++++++++++------------- tools/perf/util/annotate.h | 5 ++++ 4 files changed, 83 insertions(+), 22 deletions(-) diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c index e8d80b1adda9..37135698a5c8 100644 --- a/tools/perf/util/annotate-data.c +++ b/tools/perf/util/annotate-data.c @@ -413,6 +413,7 @@ void update_var_state(struct type_state *state, struct data_loc_info *dloc, * update_insn_state - Update type state for an instruction * @state: type state table * @dloc: data location info + * @cu_die: compile unit debug entry * @dl: disasm line for the instruction * * This function updates the @state table for the target operand of the @@ -424,7 +425,7 @@ void update_var_state(struct type_state *state, struct data_loc_info *dloc, * are true. 
*/ void update_insn_state(struct type_state *state, struct data_loc_info *dloc, - struct disasm_line *dl) + void *cu_die, struct disasm_line *dl) { struct annotated_insn_loc loc; struct annotated_op_loc *src = &loc.ops[INSN_OP_SOURCE]; @@ -466,8 +467,46 @@ void update_insn_state(struct type_state *state, struct data_loc_info *dloc, return; retry: - /* Check stack variables with offset */ - if (sreg == fbreg) { + /* Check if it's a global variable */ + if (sreg == DWARF_REG_PC) { + Dwarf_Die var_die; + struct map_symbol *ms = dloc->ms; + int offset = src->offset; + u64 ip = ms->sym->start + dl->al.offset; + u64 pc, addr; + const char *var_name = NULL; + + addr = annotate_calc_pcrel(ms, ip, offset, dl); + pc = map__rip_2objdump(ms->map, ip); + + if (die_find_variable_by_addr(cu_die, pc, addr, + &var_die, &offset) && + check_variable(&var_die, &type_die, offset, + /*is_pointer=*/false) == 0 && + die_get_member_type(&type_die, offset, &type_die)) { + state->regs[dst->reg1].type = type_die; + state->regs[dst->reg1].ok = true; + return; + } + + /* Try to get the name of global variable */ + offset = src->offset; + get_global_var_info(dloc->thread, ms, ip, dl, + dloc->cpumode, &addr, + &var_name, &offset); + + if (var_name && die_find_variable_at(cu_die, var_name, + pc, &var_die) && + check_variable(&var_die, &type_die, offset, + /*is_pointer=*/false) == 0 && + die_get_member_type(&type_die, offset, &type_die)) { + state->regs[dst->reg1].type = type_die; + state->regs[dst->reg1].ok = true; + } else + state->regs[dst->reg1].ok = false; + } + /* And check stack variables with offset */ + else if (sreg == fbreg) { struct type_state_stack *stack; int offset = src->offset - fboff; diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h index ff9acf6ea808..0bfef29fa52c 100644 --- a/tools/perf/util/annotate-data.h +++ b/tools/perf/util/annotate-data.h @@ -14,6 +14,7 @@ struct die_var_type; struct disasm_line; struct evsel; struct map_symbol; +struct 
thread; struct type_state; /** @@ -79,11 +80,13 @@ extern struct annotated_data_type stackop_type; /** * struct data_loc_info - Data location information - * @arch: architecture info + * @arch: CPU architecture info + * @thread: Thread info * @ms: Map and Symbol info * @ip: Instruction address * @var_addr: Data address (for global variables) * @var_name: Variable name (for global variables) + * @cpumode: CPU execution mode * @op: Instruction operand location (regs and offset) * @di: Debug info * @fbreg: Frame base register @@ -94,8 +97,10 @@ struct data_loc_info { /* These are input field, should be filled by caller */ struct arch *arch; struct map_symbol *ms; + struct thread *thread; u64 ip; u64 var_addr; + u8 cpumode; const char *var_name; struct annotated_op_loc *op; @@ -164,7 +169,7 @@ void update_var_state(struct type_state *state, struct data_loc_info *dloc, /* Update type state table for an instruction */ void update_insn_state(struct type_state *state, struct data_loc_info *dloc, - struct disasm_line *dl); + void *cu_die, struct disasm_line *dl); #else /* HAVE_DWARF_SUPPORT */ @@ -206,6 +211,7 @@ static inline void update_var_state(struct type_state *state __maybe_unused, static inline void update_insn_state(struct type_state *state __maybe_unused, struct data_loc_info *dloc __maybe_unused, + void *cu_die __maybe_unused, struct disasm_line *dl __maybe_unused) { } diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index ab4b6a1d86fe..d82bfb3b519d 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -3727,6 +3727,28 @@ u64 annotate_calc_pcrel(struct map_symbol *ms, u64 ip, int offset, return map__rip_2objdump(ms->map, addr); } +void get_global_var_info(struct thread *thread, struct map_symbol *ms, u64 ip, + struct disasm_line *dl, u8 cpumode, u64 *var_addr, + const char **var_name, int *poffset) +{ + struct addr_location al; + struct symbol *var; + u64 map_addr; + + *var_addr = annotate_calc_pcrel(ms, ip, *poffset, 
dl); + /* Kernel symbols might be relocated */ + map_addr = *var_addr + map__reloc(ms->map); + + addr_location__init(&al); + var = thread__find_symbol_fb(thread, cpumode, map_addr, &al); + if (var) { + *var_name = var->name; + /* Calculate type offset from the start of variable */ + *poffset = map_addr - map__unmap_ip(al.map, var->start); + } + addr_location__exit(&al); +} + /** * hist_entry__get_data_type - find data type for given hist entry * @he: hist entry @@ -3796,6 +3818,8 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he) for_each_insn_op_loc(&loc, i, op_loc) { struct data_loc_info dloc = { .arch = arch, + .thread = he->thread, + .cpumode = he->cpumode, .ms = ms, /* Recalculate IP for LOCK prefix or insn fusion */ .ip = ms->sym->start + dl->al.offset, @@ -3810,23 +3834,10 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he) /* PC-relative addressing */ if (op_loc->reg1 == DWARF_REG_PC) { - struct addr_location al; - struct symbol *var; - u64 map_addr; - - dloc.var_addr = annotate_calc_pcrel(ms, ip, op_loc->offset, dl); - /* Kernel symbols might be relocated */ - map_addr = dloc.var_addr + map__reloc(ms->map); - - addr_location__init(&al); - var = thread__find_symbol_fb(he->thread, he->cpumode, - map_addr, &al); - if (var) { - dloc.var_name = var->name; - /* Calculate type offset from the start of variable */ - dloc.type_offset = map_addr - map__unmap_ip(al.map, var->start); - } - addr_location__exit(&al); + dloc.type_offset = op_loc->offset; + get_global_var_info(he->thread, ms, ip, dl, he->cpumode, + &dloc.var_addr, &dloc.var_name, + &dloc.type_offset); } mem_type = find_data_type(&dloc); diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h index c2cc9baf08be..0786528770e1 100644 --- a/tools/perf/util/annotate.h +++ b/tools/perf/util/annotate.h @@ -23,6 +23,7 @@ struct option; struct perf_sample; struct evsel; struct symbol; +struct thread; struct annotated_data_type; struct ins { @@ 
-493,6 +494,10 @@ extern struct list_head ann_insn_stat; u64 annotate_calc_pcrel(struct map_symbol *ms, u64 ip, int offset, struct disasm_line *dl); +void get_global_var_info(struct thread *thread, struct map_symbol *ms, u64 ip, + struct disasm_line *dl, u8 cpumode, u64 *var_addr, + const char **var_name, int *poffset); + /** * struct annotated_basic_block - Basic block of instructions * @list: List node From patchwork Thu Oct 12 03:51:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418249 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 533311C3A; Thu, 12 Oct 2023 03:52:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="IUU+5ywP" Received: from mail-pl1-x62d.google.com (mail-pl1-x62d.google.com [IPv6:2607:f8b0:4864:20::62d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C0AC2F3; Wed, 11 Oct 2023 20:52:10 -0700 (PDT) Received: by mail-pl1-x62d.google.com with SMTP id d9443c01a7336-1c77449a6daso4675125ad.0; Wed, 11 Oct 2023 20:52:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082730; x=1697687530; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=5VcGuty2WfdJlpv/evrY/Aeps8iBoToI2rEN24hSWRo=; b=IUU+5ywP6Pz6bPSwet+E9QBRdVO63wof5vkhnFIlGS28oD4G+ojCgBFPw18YqR2/Hy 4n1g2MLsyiepo4zChhzxXFwxMUFj++s1VCiiszVKoxf/p7rQ2PXe6uYZjZfGirTotglH 0mQnQtKxJRyxmddk12Y0YNK4jgHDsnfKWCj6K+VAyKtv7Tb75fK/+ZS+tQrZ5TblfWg3 tdPKkquZCE2/2YLuqtopYUbX6JOLIcqfDmIQNFhxjtLMgfVCIA5ut+/cWH8HC8PyX4GT 
aIIgVpSBnGNMEpKM9hTgXbMtUtomOaJ3KfVhYisH83nS0ODhb9GJsOx9dCQgidgk0zQ6 EikQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082730; x=1697687530; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=5VcGuty2WfdJlpv/evrY/Aeps8iBoToI2rEN24hSWRo=; b=A2ujDGqJ/SaW42eIRB0PmNLMNmMsoDt7q8tzbMn1sY3kAr9a/nktEnDnOrTGz3l+H8 zDUJqfzzh8my2SeIA62MEQN2nzon7ssqneo2twwMNO+oA0RTMK10dNaDJwtdGS7h8Qny 7XIcn91pS3eSiNWOhV0tnlbqbrKpj1HF0CftHMuQlgkOnwBJeJGFf77l8q9AzX9pqicH SFXM60JSJ9XIy2ls7Expn1/I+8/L0I/R16aWbbMvt5VKE7nH3xB/Ndhh6DeIh+69RAf7 Wt+kNGjuP17hgrfmYEVoepPqyvSOqv1ozjpdlQapWjW7DJPHy1sPqCWum88BlG98aD7J KbYw== X-Gm-Message-State: AOJu0YwfXvdT6qePP1oVp9l9BWzWEglIJ6uAvLHPNOuXUItoalLdJ4En HEE58mi8jXReFSvM5qJ+RIo= X-Google-Smtp-Source: AGHT+IGT3WFYEjSubss5PGmOYJDuZC40Zfope7bEXLRaBQ2TXmX/J4Iu7M+wJgY0C3Vm3gJQi23cyg== X-Received: by 2002:a17:902:d30d:b0:1c5:ecff:1bc7 with SMTP id b13-20020a170902d30d00b001c5ecff1bc7mr23183988plc.4.1697082729721; Wed, 11 Oct 2023 20:52:09 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.52.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:52:09 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 42/48] perf annotate-data: Handle call instructions Date: Wed, 11 Oct 2023 20:51:05 -0700 Message-ID: <20231012035111.676789-43-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: 
<20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net When updating instruction states, the call instruction should play a role since it can change the register states. For simplicity, mark some registers as scratch registers (should be arch-dependent), and invalidate them all after a function call. If the function returns something, the designated register (ret_reg) will have the type info. Signed-off-by: Namhyung Kim --- tools/perf/util/annotate-data.c | 45 +++++++++++++++++++++++++++++++-- 1 file changed, 43 insertions(+), 2 deletions(-) diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c index 37135698a5c8..f3f85cb9ac00 100644 --- a/tools/perf/util/annotate-data.c +++ b/tools/perf/util/annotate-data.c @@ -40,10 +40,14 @@ struct annotated_data_type stackop_type = { /* Data type collection debug statistics */ struct annotated_data_stat ann_data_stat; -/* Type information in a register, valid when ok is true */ +/* + * Type information in a register, valid when ok is true. + * The scratch registers are invalidated after a function call. 
+ */ struct type_state_reg { Dwarf_Die type; bool ok; + bool scratch; }; /* Type information in a stack location, dynamically allocated */ @@ -67,6 +71,7 @@ struct type_state_stack { struct type_state { struct type_state_reg regs[TYPE_STATE_MAX_REGS]; struct list_head stack_vars; + int ret_reg; }; static bool has_reg_type(struct type_state *state, int reg) @@ -74,10 +79,23 @@ static bool has_reg_type(struct type_state *state, int reg) return (unsigned)reg < ARRAY_SIZE(state->regs); } -void init_type_state(struct type_state *state, struct arch *arch __maybe_unused) +void init_type_state(struct type_state *state, struct arch *arch) { memset(state, 0, sizeof(*state)); INIT_LIST_HEAD(&state->stack_vars); + + if (arch__is(arch, "x86")) { + state->regs[0].scratch = true; + state->regs[1].scratch = true; + state->regs[2].scratch = true; + state->regs[4].scratch = true; + state->regs[5].scratch = true; + state->regs[8].scratch = true; + state->regs[9].scratch = true; + state->regs[10].scratch = true; + state->regs[11].scratch = true; + state->ret_reg = 0; + } } void exit_type_state(struct type_state *state) @@ -434,6 +452,29 @@ void update_insn_state(struct type_state *state, struct data_loc_info *dloc, int fbreg = dloc->fbreg; int fboff = 0; + if (ins__is_call(&dl->ins)) { + Dwarf_Die func_die; + + /* __fentry__ will preserve all registers */ + if (dl->ops.target.sym && + !strcmp(dl->ops.target.sym->name, "__fentry__")) + return; + + /* Otherwise invalidate scratch registers after call */ + for (unsigned i = 0; i < ARRAY_SIZE(state->regs); i++) { + if (state->regs[i].scratch) + state->regs[i].ok = false; + } + + /* Update register with the return type (if any) */ + if (die_find_realfunc(cu_die, dl->ops.target.addr, &func_die) && + die_get_real_type(&func_die, &type_die)) { + state->regs[state->ret_reg].type = type_die; + state->regs[state->ret_reg].ok = true; + } + return; + } + /* FIXME: remove x86 specific code and handle more instructions like LEA */ if 
(!strstr(dl->ins.name, "mov")) return; From patchwork Thu Oct 12 03:51:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418251 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B7EE61C3A; Thu, 12 Oct 2023 03:52:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="H3XLs9KS" Received: from mail-pf1-x432.google.com (mail-pf1-x432.google.com [IPv6:2607:f8b0:4864:20::432]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6C7F4198B; Wed, 11 Oct 2023 20:52:12 -0700 (PDT) Received: by mail-pf1-x432.google.com with SMTP id d2e1a72fcca58-68fb85afef4so455928b3a.1; Wed, 11 Oct 2023 20:52:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082731; x=1697687531; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=MfHvfZCOPkeeX0/2Z0PzUnXJQ0Ykg0RYecE8el64+KU=; b=H3XLs9KSraE69cE0ezVbHgibcYknGNKa5UJHcbWTr6FyxGlmeYoq0c3wQfm8NJnYXE wTcbQjQUMaPbGyb0p9Ez6M3GZaPWUBE/zYj6mBPXVyCNw7/mxT3R8V/dQh8tVIGIiCVG EH6ldiwq8GifUzpNO4wlLvPvviAwa+DJfmUak5Y/5Qc/J9uanGYngfnuqSa8Wst09AOI K8OHmaX9keIUU7tKd2B9N8LvVS18C5PzF/bZ7XoVBPHHjGjtaQ/uKfdQWlyC1p/dsDfi 9viK1+S8HbzMHaGoJ9twruBs7Q7rXowI3+1hTOr0OPkK3GMo7FvckqPx0Vc4CmjB55nm k2+g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082731; x=1697687531; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; 
bh=MfHvfZCOPkeeX0/2Z0PzUnXJQ0Ykg0RYecE8el64+KU=; b=TMhInYquiOmR+O24xrbNwW1nPjVAqn5r3krPAIWuvy2OyJwavMn+wMwGS4cmPQur91 kA8/9saTHzRPOCYESjXbqzbP/ng+4dayC79jHhYE3PQmKP208FI5AOu74nhBlfmodJGm 4aGgOJXobvjYaWxUuDvL6158W+S3Od9hi7Zf5J4OCrWmi/twWnRo/NFD3XsHyzQ7qG8p qCfaiT4QfjjNEbbMrOkuL4/qpztDVliQtNFaf+ClgD1/56Fu7N43njOqA+S8OZsEQNdQ IGm0qLXeD/NGwrZ4uLSCYWHez50hel6vuNIvHf8jopgC5HbhHzfSJ10KHPhuyp5PGWn4 s2Zg== X-Gm-Message-State: AOJu0Yxq+enBdA02n6IOoD3A2cJcN/a8xvB1P9H/AU2TKHByaLLWCAMH IVSLtqm1usZTJJMqPnCs9Fg= X-Google-Smtp-Source: AGHT+IEX0Pm/i3UHxHoLsnkXOe002/x8vNuig5KQuVGQZadCkAJjXkLphYxSegieNhTH4jOWTMH8iA== X-Received: by 2002:a05:6a20:f3b0:b0:15e:9c2f:5294 with SMTP id qr48-20020a056a20f3b000b0015e9c2f5294mr18897918pzb.56.1697082730976; Wed, 11 Oct 2023 20:52:10 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.52.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:52:10 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 43/48] perf annotate-data: Implement instruction tracking Date: Wed, 11 Oct 2023 20:51:06 -0700 Message-ID: <20231012035111.676789-44-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-0.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, 
HEADER_FROM_DIFFERENT_DOMAINS,OBFU_UNSUB_UL,RCVD_IN_DNSWL_BLOCKED, SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net

If perf fails to find a variable for the location directly, it might be
due to a missing variable in the source code.  For example, accessing
pointer variables in a chain can result in a case like the one below:

  struct foo *foo = ...;

  int i = foo->bar->baz;

DWARF debug information is created for each variable, so there is an
entry for 'foo'.  But there is no variable for 'foo->bar', so perf cannot
know the type of 'bar' and 'baz'.

The above source code can be compiled to the following x86 instructions:

  mov  0x8(%rax), %rcx
  mov  0x4(%rcx), %rdx   <=== PMU sample
  mov  %rdx, -4(%rbp)

Let's say 'foo' is located in %rax, which holds a pointer to struct foo.
But the perf sample is captured at the second instruction, and there is
no variable or type info for %rcx.  It would be great if the compiler
could generate debug info for %rcx, but we should handle it on our side.
So this patch implements the logic to iterate over instructions and
update the type table for each location.

As it has already collected a list of scopes that include the target
instruction, we can use it to construct the type table smartly.

  +----------------  scope[0] subprogram
  |
  | +--------------  scope[1] lexical_block
  | |
  | | +------------  scope[2] inlined_subroutine
  | | |
  | | | +----------  scope[3] inlined_subroutine
  | | | |
  | | | | +--------  scope[4] lexical_block
  | | | | |
  | | | | |     *** target instruction
  ...

Imagine the target instruction is nested in 5 scopes; each scope has its
own variables and parameters.  The search starts with the innermost
scope (4): it finds the shortest path from the start of scope[4] to the
target address and builds a list of basic blocks.  Then it iterates over
the basic blocks with the variables in the scope and updates the table.
If it finds a type at the target instruction, it returns it.
Otherwise, it moves to the upper scope[3].  Now it searches the shortest
path from the start of scope[3] to the start of scope[4] and connects it
to the existing basic block list.  Then it iterates over the blocks with
the variables of both scopes.  It can repeat this until it finds a type
at the target instruction or reaches the top scope[0].

As the basic blocks contain the shortest path, it does not need to worry
about branches and can update the table in a simple linear pass.

With this change, the stat now looks like below:

  Annotate data type stats:
  total 294, ok 185 (62.9%), bad 109 (37.1%)
  -----------------------------------------------------------
          30 : no_sym
          32 : no_mem_ops
          27 : no_var
          13 : no_typeinfo
           7 : bad_offset

Signed-off-by: Namhyung Kim
---
 tools/perf/util/annotate-data.c | 232 ++++++++++++++++++++++++++++++++
 1 file changed, 232 insertions(+)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c index f3f85cb9ac00..1992ef20f71d 100644 --- a/tools/perf/util/annotate-data.c +++ b/tools/perf/util/annotate-data.c @@ -614,6 +614,231 @@ void update_insn_state(struct type_state *state, struct data_loc_info *dloc, /* Case 4. memory to memory transfers (not handled for now) */ } +/* Prepend this_blocks to full_blocks, removing duplicate disasm line */ +static void prepend_basic_blocks(struct list_head *this_blocks, + struct list_head *full_blocks) +{ + struct annotated_basic_block *first_bb, *last_bb; + + last_bb = list_last_entry(this_blocks, typeof(*last_bb), list); + first_bb = list_first_entry(full_blocks, typeof(*first_bb), list); + + if (list_empty(full_blocks)) + goto out; + + if (last_bb->end != first_bb->begin) { + pr_debug("prepend basic blocks: mismatched disasm line %lx -> %lx\n", + last_bb->end->al.offset, first_bb->begin->al.offset); + goto out; + } + + /* Does the basic block have only one disasm_line?
*/ + if (last_bb->begin == last_bb->end) { + list_del(&last_bb->list); + free(last_bb); + goto out; + } + + last_bb->end = list_prev_entry(last_bb->end, al.node); + +out: + list_splice(this_blocks, full_blocks); +} + +static void delete_basic_blocks(struct list_head *basic_blocks) +{ + struct annotated_basic_block *bb, *tmp; + + list_for_each_entry_safe(bb, tmp, basic_blocks, list) { + list_del(&bb->list); + free(bb); + } +} + +/* Make sure all variables have a valid start address */ +static void fixup_var_address(struct die_var_type *var_types, u64 addr) +{ + while (var_types) { + /* + * Some variables have no address range meaning it's always + * available in the whole scope. Let's adjust the start + * address to the start of the scope. + */ + if (var_types->addr == 0) + var_types->addr = addr; + + var_types = var_types->next; + } +} + +static void delete_var_types(struct die_var_type *var_types) +{ + while (var_types) { + struct die_var_type *next = var_types->next; + + free(var_types); + var_types = next; + } +} + +/* It's at the target address, check if it has a matching type */ +static bool find_matching_type(struct type_state *state, + struct data_loc_info *dloc, int reg, + Dwarf_Die *type_die) +{ + Dwarf_Word size; + + if (state->regs[reg].ok) { + int tag = dwarf_tag(&state->regs[reg].type); + + /* + * Normal registers should hold a pointer (or array) to + * dereference a memory location. 
+ */ + if (tag != DW_TAG_pointer_type && tag != DW_TAG_array_type) + return false; + + if (die_get_real_type(&state->regs[reg].type, type_die) == NULL) + return false; + + dloc->type_offset = dloc->op->offset; + + /* Get the size of the actual type */ + if (dwarf_aggregate_size(type_die, &size) < 0 || + (unsigned)dloc->type_offset >= size) + return false; + + return true; + } + + if (reg == dloc->fbreg) { + struct type_state_stack *stack; + + stack = find_stack_state(state, dloc->type_offset); + if (stack == NULL) + return false; + + *type_die = stack->type; + /* Update the type offset from the start of slot */ + dloc->type_offset -= stack->offset; + return true; + } + + if (dloc->fb_cfa) { + struct type_state_stack *stack; + u64 pc = map__rip_2objdump(dloc->ms->map, dloc->ip); + int fbreg, fboff; + + if (die_get_cfa(dloc->di->dbg, pc, &fbreg, &fboff) < 0) + fbreg = -1; + + if (reg != fbreg) + return false; + + stack = find_stack_state(state, dloc->type_offset - fboff); + if (stack == NULL) + return false; + + *type_die = stack->type; + /* Update the type offset from the start of slot */ + dloc->type_offset -= fboff + stack->offset; + return true; + } + + return false; +} + +/* Iterate instructions in basic blocks and update type table */ +static bool find_data_type_insn(struct data_loc_info *dloc, int reg, + struct list_head *basic_blocks, + struct die_var_type *var_types, + Dwarf_Die *cu_die, Dwarf_Die *type_die) +{ + struct type_state state; + struct symbol *sym = dloc->ms->sym; + struct annotation *notes = symbol__annotation(sym); + struct annotated_basic_block *bb; + bool found = false; + + init_type_state(&state, dloc->arch); + + list_for_each_entry(bb, basic_blocks, list) { + struct disasm_line *dl = bb->begin; + + list_for_each_entry_from(dl, &notes->src->source, al.node) { + u64 this_ip = sym->start + dl->al.offset; + u64 addr = map__rip_2objdump(dloc->ms->map, this_ip); + + /* Update variable type at this address */ + update_var_state(&state, dloc, addr,
var_types); + + if (this_ip == dloc->ip) { + found = find_matching_type(&state, dloc, reg, + type_die); + goto out; + } + + /* Update type table after processing the instruction */ + update_insn_state(&state, dloc, cu_die, dl); + if (dl == bb->end) + break; + } + } + +out: + exit_type_state(&state); + return found; +} + +/* + * Construct a list of basic blocks for each scope with variables and try to find + * the data type by updating a type state table through instructions. + */ +static int find_data_type_block(struct data_loc_info *dloc, int reg, + Dwarf_Die *cu_die, Dwarf_Die *scopes, + int nr_scopes, Dwarf_Die *type_die) +{ + LIST_HEAD(basic_blocks); + struct die_var_type *var_types = NULL; + u64 src_ip, dst_ip; + int ret = -1; + + dst_ip = dloc->ip; + for (int i = nr_scopes - 1; i >= 0; i--) { + Dwarf_Addr base, start, end; + LIST_HEAD(this_blocks); + + if (dwarf_ranges(&scopes[i], 0, &base, &start, &end) < 0) + break; + + src_ip = map__objdump_2rip(dloc->ms->map, start); + + /* Get basic blocks for this scope */ + if (annotate_get_basic_blocks(dloc->ms->sym, src_ip, dst_ip, + &this_blocks) < 0) + continue; + prepend_basic_blocks(&this_blocks, &basic_blocks); + + /* Get variable info for this scope and add to var_types list */ + die_collect_vars(&scopes[i], &var_types); + fixup_var_address(var_types, start); + + /* Find from start of this scope to the target instruction */ + if (find_data_type_insn(dloc, reg, &basic_blocks, var_types, + cu_die, type_die)) { + ret = 0; + break; + } + + /* Go up to the next scope and find blocks to the start */ + dst_ip = src_ip; + } + + delete_basic_blocks(&basic_blocks); + delete_var_types(var_types); + return ret; +} + /* The result will be saved in @type_die */ static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die) { @@ -714,6 +939,13 @@ static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die) goto out; } + if (reg != DWARF_REG_PC) { + ret = find_data_type_block(dloc, reg, 
&cu_die, scopes, + nr_scopes, type_die); + if (ret == 0) + goto out; + } + if (loc->multi_regs && reg == loc->reg1 && loc->reg1 != loc->reg2) { reg = loc->reg2; goto retry; From patchwork Thu Oct 12 03:51:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418250 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CD7D815D5; Thu, 12 Oct 2023 03:52:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="BoI8o5xN" Received: from mail-pl1-x62e.google.com (mail-pl1-x62e.google.com [IPv6:2607:f8b0:4864:20::62e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 59182102; Wed, 11 Oct 2023 20:52:13 -0700 (PDT) Received: by mail-pl1-x62e.google.com with SMTP id d9443c01a7336-1c9e06f058bso1579565ad.0; Wed, 11 Oct 2023 20:52:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082732; x=1697687532; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=x0XefwurqynPVw7eemK2Zs/wgP6X8os3y1RMazTGeGI=; b=BoI8o5xN9VsXtWcWdlTCly88YOJwKx5MoWWX9JLNyY/rgyCtaOvqwA4E1h3pWytRyG PLsNaKgygG17Ny362mLzHrLLW6/gN/LlL8nOvwgmQZCa8FDqOaTETor4Uwjwcl94YbZR 4fli/hQ7OZJMC/a8qVA4Npd9DzKoEHiuYFhou3COfrshF3vgacnpPg7CtzLURjFszTFB ogaYaeCg7NS7CyQkV1s+ITUgim4KdHFJsQ9eKAeUVqObNDgCZlZz8xrtW3R27QaoTtyM IOqV3C/IBsAZ5l4q1ii3ydika/VlaI0XDEOSvMxcMy/o4WqCl7k6houguQCr33ohnz0u 6qmQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082732; x=1697687532; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=x0XefwurqynPVw7eemK2Zs/wgP6X8os3y1RMazTGeGI=; b=kFGPn/QMzD23LYDjtLPFKVpy8GPIQEZD+OKJEP+4wpNql3WT/fR07G0bqsGoFPsL3S K/iib9JdiB3KvVpvR1iUVRG/CDKC2oc+/TmEiKhdrwh/rddPQFibUQH8SpBTourTc5if UaL+JoiF8TRr32ZrQJsIXH/XQO61feinqhFHXBqQgRgGw6+ZwmayqMbADWSwZT0R4ZIH sq/nOxflHBQx0InwrfVH7KMWb+19XeAAhHr3eydnj5M8HmPqfbK2fqzmfbC2xUjBAvFn tG0x8MNgs2nA0R86qLOpL+Nu8ylJTVVlvicL5bQFcJB8ww7BnTnglw1CWUQYf4BF+/oU CdNQ== X-Gm-Message-State: AOJu0YwPwv6j9U82ZfGdPY2DadjY7smemMVybRH2Xpi61CZy6LrQ1402 KZQ2xGnCpuPb7UodNlDSrdk= X-Google-Smtp-Source: AGHT+IERx4ookeIMjvNYjVl3alziknMVM+QRrVZ0bG/brZrYaUm+RaglJ/C+XXX3ThSTtCks+N2cRw== X-Received: by 2002:a17:902:e852:b0:1c5:59dc:6e93 with SMTP id t18-20020a170902e85200b001c559dc6e93mr34504123plg.3.1697082732334; Wed, 11 Oct 2023 20:52:12 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.52.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:52:12 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 44/48] perf annotate: Parse x86 segment register location Date: Wed, 11 Oct 2023 20:51:07 -0700 Message-ID: <20231012035111.676789-45-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 
tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net

Add a segment field to struct annotated_op_loc and fill it in for
segment-based addressing like %gs:0x28.  For simplicity, only the %gs
register is handled for now.

Signed-off-by: Namhyung Kim
---
 tools/perf/util/annotate.c | 21 +++++++++++++++++++--
 tools/perf/util/annotate.h | 13 +++++++++++++
 2 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index d82bfb3b519d..7a097f64a28a 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -3513,6 +3513,12 @@ static int extract_reg_offset(struct arch *arch, const char *str, * %gs:0x18(%rbx). In that case it should skip the part. */ if (*str == arch->objdump.register_char) { + if (arch__is(arch, "x86")) { + /* FIXME: Handle other segment registers */ + if (!strncmp(str, "%gs:", 4)) + op_loc->segment = INSN_SEG_X86_GS; + } + while (*str && !isdigit(*str) && *str != arch->objdump.memory_ref_char) str++; @@ -3609,8 +3615,19 @@ int annotate_get_insn_location(struct arch *arch, struct disasm_line *dl, op_loc->multi_regs = multi_regs; extract_reg_offset(arch, insn_str, op_loc); } else { - char *s = strdup(insn_str); + char *s; + + if (arch__is(arch, "x86")) { + /* FIXME: Handle other segment registers */ + if (!strncmp(insn_str, "%gs:", 4)) { + op_loc->segment = INSN_SEG_X86_GS; + op_loc->offset = strtol(insn_str + 4, + NULL, 0); + continue; + } + } + s = strdup(insn_str); if (s) { op_loc->reg1 = get_dwarf_regnum(s, 0); free(s); @@ -3826,7 +3843,7 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he) .op = op_loc, }; - if (!op_loc->mem_ref) + if (!op_loc->mem_ref && op_loc->segment == INSN_SEG_NONE) continue; /* Recalculate IP
because of LOCK prefix or insn fusion */ diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h index 0786528770e1..076b5338ade1 100644 --- a/tools/perf/util/annotate.h +++ b/tools/perf/util/annotate.h @@ -444,6 +444,7 @@ int annotate_check_args(struct annotation_options *args); * @reg1: First register in the operand * @reg2: Second register in the operand * @offset: Memory access offset in the operand + * @segment: Segment selector register * @mem_ref: Whether the operand accesses memory * @multi_regs: Whether the second register is used */ @@ -451,6 +452,7 @@ struct annotated_op_loc { int reg1; int reg2; int offset; + u8 segment; bool mem_ref; bool multi_regs; }; @@ -462,6 +464,17 @@ enum annotated_insn_ops { INSN_OP_MAX, }; +enum annotated_x86_segment { + INSN_SEG_NONE = 0, + + INSN_SEG_X86_CS, + INSN_SEG_X86_DS, + INSN_SEG_X86_ES, + INSN_SEG_X86_FS, + INSN_SEG_X86_GS, + INSN_SEG_X86_SS, +}; + /** * struct annotated_insn_loc - Location info of instruction * @ops: Array of location info for source and target operands From patchwork Thu Oct 12 03:51:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418253 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3E06D5254; Thu, 12 Oct 2023 03:52:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="h5a/6m3p" Received: from mail-pg1-x531.google.com (mail-pg1-x531.google.com [IPv6:2607:f8b0:4864:20::531]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D0AF7199A; Wed, 11 Oct 2023 20:52:14 -0700 (PDT) Received: by mail-pg1-x531.google.com with SMTP id 41be03b00d2f7-5809d5fe7f7so398725a12.3; Wed, 11 Oct 2023 20:52:14 -0700 (PDT) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082733; x=1697687533; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=n1kE8iuDTwlI9GdS4S0Z9K7JNGWfoZ1amqFutl3GXsA=; b=h5a/6m3ptIDnWrQnWunqK6YgJdALR4Rx83P6CTcH7uiJUnYE+HmptQswrc6ortrgaK dc2X1uPxUO0/flFZDWltm/RV6GdDqVb/1BTJAwO3qyj//GKhXhgc6A+o6+91w6XgCRP7 DMUuZauEi+l696AV0ifNOrwJI/0SCmjj7Sv9hSD6GKzsS6LfBrp5Qs3/bGss4PMEZHil 9fBfAiwNlwdFzeqk7TVTEJF6i/i66aSrYCTxx3nR/rwlKA9fxwEBy7WMp8obaMQS+/xb Ps0TviaIbNpU/JfhxrcdJEi7Jm4MCmu6dqb3y3ywq5ElkNCnMVoYgCO29MfCUUwYsyUq t2kQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082733; x=1697687533; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=n1kE8iuDTwlI9GdS4S0Z9K7JNGWfoZ1amqFutl3GXsA=; b=w7+0Xq36mJrZtessq/l6HzwRVFRKb3B5H9BmSLBEDTntz532N9ANZ8s0Ovq8+L6LF7 uAVMQi0fyNMNU6UAqdDAT1PhZYwgZKLMAixi7E+X5jXqohxuiwPiFHlY/bj0gAzDVXJW qU0qMXzKjGc/IYbxtSlm3dbcyV1kxkgQNNG17EcBm2UKShYcK88DoQaNrjXJC8aN4q5w pFkmDCZORL1ovc1Ap3r37cu+DgN6euidu4AAFIWW8inmkEIqRnSx5NCpGvQWjnfIU6at yQqVU+xoFuqk9wbwZInZkdUWnhwkeE1eWem+WntFXjk5U8RoQOZtk7FS7T+rW5UitEtr Xh2w== X-Gm-Message-State: AOJu0Ywa781CHAF/nItbnqJ/F2PK1fbuOEVg7BzYNLfRrZPwi654NtH/ ar2tVkj5ItCbNlDvihwj87c= X-Google-Smtp-Source: AGHT+IEJc3M01zGWFp0ka99aRQFjBrE5xPBLbFf+iPcuVB2WHyZS+rbfRjwijLAmwHsLSmGjFTzcVg== X-Received: by 2002:a05:6a20:12c1:b0:174:63a9:2ab with SMTP id v1-20020a056a2012c100b0017463a902abmr2119183pzg.45.1697082733543; Wed, 11 Oct 2023 20:52:13 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.52.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 
11 Oct 2023 20:52:13 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 45/48] perf annotate-data: Handle this-cpu variables in kernel Date: Wed, 11 Oct 2023 20:51:08 -0700 Message-ID: <20231012035111.676789-46-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net

On x86, the kernel gets the current task using the current macro like
below:

  #define current get_current()

  static __always_inline struct task_struct *get_current(void)
  {
  	return this_cpu_read_stable(pcpu_hot.current_task);
  }

So it returns the current_task field of struct pcpu_hot, which is the
first member.  On my build, pcpu_hot is located at 0x32940.

  $ nm vmlinux | grep pcpu_hot
  0000000000032940 D pcpu_hot

And the current macro generates an instruction like below:

  mov  %gs:0x32940, %rcx

So the %gs segment register points to the beginning of the per-cpu
region of this cpu, and the variable is addressed by a constant offset
from it.

Let's update the instruction location info to have a segment register
and handle %gs in kernel to look up a global variable.  The new
get_percpu_var_info() helper gets information about the variable.
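
As a rough illustration of the lookup described above (a sketch only, not
the perf implementation): given a table of symbols, find the one that
covers the constant taken from the %gs-relative operand and report the
offset inside it.  The struct sym type, the find_percpu_var() name, the
symbol sizes and the second table entry are all hypothetical; only the
pcpu_hot address comes from the nm output above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol record: start address, size in bytes, and name. */
struct sym {
	uint64_t start;
	uint64_t size;
	const char *name;
};

/*
 * Find the symbol covering @addr and store the offset from its start
 * in @poffset.  Returns the symbol name, or NULL if nothing covers @addr
 * (in which case @poffset is left untouched).
 */
static const char *find_percpu_var(const struct sym *syms, size_t nr,
				   uint64_t addr, int *poffset)
{
	for (size_t i = 0; i < nr; i++) {
		if (addr >= syms[i].start &&
		    addr - syms[i].start < syms[i].size) {
			*poffset = (int)(addr - syms[i].start);
			return syms[i].name;
		}
	}
	return NULL;
}

/* Example table; sizes and the second entry are made up for illustration. */
static const struct sym example_syms[] = {
	{ 0x32940, 64, "pcpu_hot" },
	{ 0x32980, 8,  "other_percpu_var" },
};
```

With this table, the operand of "mov %gs:0x32940, %rcx" resolves to
pcpu_hot at offset 0, which is where the commit message says current_task
lives.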
Treat it as a global variable by changing the register number to
DWARF_REG_PC.

Signed-off-by: Namhyung Kim
---
 tools/perf/util/annotate.c | 31 +++++++++++++++++++++++++++++++
 tools/perf/util/annotate.h |  4 ++++
 2 files changed, 35 insertions(+)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index 7a097f64a28a..414ae45b7c06 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -3766,6 +3766,27 @@ void get_global_var_info(struct thread *thread, struct map_symbol *ms, u64 ip, addr_location__exit(&al); } +void get_percpu_var_info(struct thread *thread, struct map_symbol *ms, + u8 cpumode, u64 var_addr, const char **var_name, + int *poffset) +{ + struct addr_location al; + struct symbol *var; + u64 map_addr; + + /* Kernel symbols might be relocated */ + map_addr = var_addr + map__reloc(ms->map); + + addr_location__init(&al); + var = thread__find_symbol_fb(thread, cpumode, map_addr, &al); + if (var) { + *var_name = var->name; + /* Calculate type offset from the start of variable */ + *poffset = map_addr - map__unmap_ip(al.map, var->start); + } + addr_location__exit(&al); +} + /** * hist_entry__get_data_type - find data type for given hist entry * @he: hist entry @@ -3857,6 +3878,16 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he) &dloc.type_offset); } + /* This CPU access in kernel - pretend PC-relative addressing */ + if (op_loc->reg1 == -1 && ms->map->dso->kernel && + arch__is(arch, "x86") && op_loc->segment == INSN_SEG_X86_GS) { + dloc.var_addr = op_loc->offset; + get_percpu_var_info(he->thread, ms, he->cpumode, + dloc.var_addr, &dloc.var_name, + &dloc.type_offset); + op_loc->reg1 = DWARF_REG_PC; + } + mem_type = find_data_type(&dloc); if (mem_type) istat->good++; diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h index 076b5338ade1..c090cea1abdc 100644 --- a/tools/perf/util/annotate.h +++ b/tools/perf/util/annotate.h @@ -511,6 +511,10 @@ void get_global_var_info(struct
thread *thread, struct map_symbol *ms, u64 ip, struct disasm_line *dl, u8 cpumode, u64 *var_addr, const char **var_name, int *poffset); +void get_percpu_var_info(struct thread *thread, struct map_symbol *ms, + u8 cpumode, u64 var_addr, const char **var_name, + int *poffset); + /** * struct annotated_basic_block - Basic block of instructions * @list: List node From patchwork Thu Oct 12 03:51:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418255 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B69792112; Thu, 12 Oct 2023 03:52:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="M9g6qVsI" Received: from mail-pf1-x42e.google.com (mail-pf1-x42e.google.com [IPv6:2607:f8b0:4864:20::42e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0932219AA; Wed, 11 Oct 2023 20:52:16 -0700 (PDT) Received: by mail-pf1-x42e.google.com with SMTP id d2e1a72fcca58-692ada71d79so453265b3a.1; Wed, 11 Oct 2023 20:52:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082735; x=1697687535; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=32z5oH+I7vh+Z/UZDnI1j9besZ0XJgEASBZ50FSTvqQ=; b=M9g6qVsIRuFYeiXoviQTLK5Z8dpfmzZYQoFWn0QXpYB2Sd/ZGD9ckjZ1+AHitr877k I5qNh30jWd+Othq9kxOxEzvNnKWlMX7rCSzGcULbQLzX7Jpfub20ryxrIWHxv1lKU3Gg iD2I1gnVY4z/044xUVTYq5oI1qUevv3ogg11HmhUxK9wfkxidYGAyySPc7knaBxnPmUG cn4JP6qsrB/yU7RUvUTSAZ1DOnklMeEY++GY7WRl0pyyFAncpApGmtqyroW48JrKAKUy SNzoa0PA4DZhJ1o84QuxIeS60AyKF2vp6GhFRGFMHM4jVDzrVUr0/Ymwx9D6et1rEZyn Mnlw== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082735; x=1697687535; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=32z5oH+I7vh+Z/UZDnI1j9besZ0XJgEASBZ50FSTvqQ=; b=cvLxAvau8rAF7wAYlXRX+OW9I031rLY0jkCX+Vse59cTHdFukueNVrn/+ZoP23fNlY WtjP4rWwPgmzowK7nE7by7mF1K22lhGwHaxk01Oa9tkplOwK4j6pySXGExIqgEAg5w/Q gRF81i72SSiEwRBRGsIPI/41cuFfrYyTko9O0W7u+dkfPNcEHbQeeCo1HJoyZhyChcf+ L6lXaa4Daipz9jPoUXe5QWfSvT3GTN9yEw4mPP8os7JY7t2vAwDIgR8ga5CwW/6ROUPy IalIUNzBbJ8/a50YtmG3EmOJeALEHcwPrCbcDOiT9TiQqlFgBGEZmeYnROMKcgt20YkZ qlYg== X-Gm-Message-State: AOJu0YwGciQNF+AY32MKVe1pe30BciV0njLj116/ihy8V785Lei03zih poL+0z2sa+St/G4Y1PTZsLg= X-Google-Smtp-Source: AGHT+IHJq4vVq1fmAyTMkN9TzIW9VIqYrEYCUKye1tVv9QzvsN5hQLAUIb5Q+ve7KSQeoHvgFjU8JQ== X-Received: by 2002:a05:6a20:9754:b0:174:2286:81f4 with SMTP id hs20-20020a056a20975400b00174228681f4mr1951878pzc.14.1697082734885; Wed, 11 Oct 2023 20:52:14 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.52.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:52:14 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 46/48] perf annotate-data: Track instructions with a this-cpu variable Date: Wed, 11 Oct 2023 20:51:09 -0700 Message-ID: <20231012035111.676789-47-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: 
<20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net

Like global variables, per-cpu variables of this cpu should be tracked
correctly.  Factor out get_global_var_type() to handle both global and
per-cpu (for this cpu) variables in the same manner.

Signed-off-by: Namhyung Kim
---
 tools/perf/util/annotate-data.c | 84 +++++++++++++++++++++++----------
 1 file changed, 60 insertions(+), 24 deletions(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c index 1992ef20f71d..677dc01432d3 100644 --- a/tools/perf/util/annotate-data.c +++ b/tools/perf/util/annotate-data.c @@ -427,6 +427,37 @@ void update_var_state(struct type_state *state, struct data_loc_info *dloc, } } +static bool get_global_var_type(Dwarf_Die *cu_die, struct map_symbol *ms, u64 ip, + u64 var_addr, const char *var_name, int var_offset, + Dwarf_Die *type_die) +{ + u64 pc; + int offset = var_offset; + bool is_pointer = false; + Dwarf_Die var_die; + + pc = map__rip_2objdump(ms->map, ip); + + /* Try to get the variable by address first */ + if (die_find_variable_by_addr(cu_die, pc, var_addr, &var_die, &offset) && + check_variable(&var_die, type_die, offset, is_pointer) == 0 && + die_get_member_type(type_die, offset, type_die)) + return true; + + if (var_name == NULL) + return false; + + offset = var_offset; + + /* Try to get the name of global variable */ + if (die_find_variable_at(cu_die, var_name, pc, &var_die) && + check_variable(&var_die, type_die, offset, is_pointer) == 0 && + die_get_member_type(type_die, offset, type_die)) +
return true; + + return false; +} + /** * update_insn_state - Update type state for an instruction * @state: type state table @@ -490,14 +521,36 @@ void update_insn_state(struct type_state *state, struct data_loc_info *dloc, fbreg = -1; } - /* Case 1. register to register transfers */ + /* Case 1. register to register or segment:offset to register transfers */ if (!src->mem_ref && !dst->mem_ref) { if (!has_reg_type(state, dst->reg1)) return; if (has_reg_type(state, src->reg1)) state->regs[dst->reg1] = state->regs[src->reg1]; - else + else if (dloc->ms->map->dso->kernel && + src->segment == INSN_SEG_X86_GS) { + struct map_symbol *ms = dloc->ms; + int offset = src->offset; + u64 ip = ms->sym->start + dl->al.offset; + const char *var_name = NULL; + u64 var_addr; + + /* + * In kernel, %gs points to a per-cpu region for the + * current CPU. Access with a constant offset should + * be treated as a global variable access. + */ + var_addr = src->offset; + get_percpu_var_info(dloc->thread, ms, dloc->cpumode, + var_addr, &var_name, &offset); + + if (get_global_var_type(cu_die, ms, ip, var_addr, + var_name, offset, &type_die)) { + state->regs[dst->reg1].type = type_die; + state->regs[dst->reg1].ok = true; + } + } else state->regs[dst->reg1].ok = false; } /* Case 2. 
memory to register transers */ @@ -510,37 +563,20 @@ void update_insn_state(struct type_state *state, struct data_loc_info *dloc, retry: /* Check if it's a global variable */ if (sreg == DWARF_REG_PC) { - Dwarf_Die var_die; struct map_symbol *ms = dloc->ms; int offset = src->offset; u64 ip = ms->sym->start + dl->al.offset; - u64 pc, addr; const char *var_name = NULL; + u64 var_addr; - addr = annotate_calc_pcrel(ms, ip, offset, dl); - pc = map__rip_2objdump(ms->map, ip); - - if (die_find_variable_by_addr(cu_die, pc, addr, - &var_die, &offset) && - check_variable(&var_die, &type_die, offset, - /*is_pointer=*/false) == 0 && - die_get_member_type(&type_die, offset, &type_die)) { - state->regs[dst->reg1].type = type_die; - state->regs[dst->reg1].ok = true; - return; - } + var_addr = annotate_calc_pcrel(ms, ip, offset, dl); - /* Try to get the name of global variable */ - offset = src->offset; get_global_var_info(dloc->thread, ms, ip, dl, - dloc->cpumode, &addr, + dloc->cpumode, &var_addr, &var_name, &offset); - if (var_name && die_find_variable_at(cu_die, var_name, - pc, &var_die) && - check_variable(&var_die, &type_die, offset, - /*is_pointer=*/false) == 0 && - die_get_member_type(&type_die, offset, &type_die)) { + if (get_global_var_type(cu_die, ms, ip, var_addr, + var_name, offset, &type_die)) { state->regs[dst->reg1].type = type_die; state->regs[dst->reg1].ok = true; } else From patchwork Thu Oct 12 03:51:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418254 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5A4A72109; Thu, 12 Oct 2023 03:52:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com 
header.b="X3SM2v4c" Received: from mail-pl1-x631.google.com (mail-pl1-x631.google.com [IPv6:2607:f8b0:4864:20::631]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9AB8619B0; Wed, 11 Oct 2023 20:52:17 -0700 (PDT) Received: by mail-pl1-x631.google.com with SMTP id d9443c01a7336-1c60778a3bfso4900275ad.1; Wed, 11 Oct 2023 20:52:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082736; x=1697687536; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=gNrNBiU6krvSQLSVIZCNwHSxv8ln34r7JWtg5yyof2s=; b=X3SM2v4cmKQL1Y2lAPBJTCTFMX1jatJtw7ICKlFFv/Xo8pAHs2P7BEFLgXpJJjnjnA YRPrmuy8ipPMPuU/BE55/ruNkCg+++XnucRYCyUDoovxWiPW7b35o4CTbJLMJU7NQWnz t9l2BXXhNzZvx86/CPdJhKuXvBL6dw1n0NJtkl9a5gWWAGrQsBSDpv/eBTmXPHtxlOJj azhIvhvSblmSDrR1ujHpSBE3a8DWd3O9aRhmHzQmu7stSLYj+bnEE/2WiLK+hNrwkaC0 t1Sh+x+BPPFexsPF7sN3/22fg0Q2e1rwqxu6MJEdVdiJYikJFwYskCqKccfnbKSo4XED Y0YA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082736; x=1697687536; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=gNrNBiU6krvSQLSVIZCNwHSxv8ln34r7JWtg5yyof2s=; b=AtpzxzlZKOiyqH1oKqoss4ylJH3iC24McqUYv8l0ij2f8TkV7NQl8VITeab2gz+fB4 tWI5YFJpVgaNm7D3bZNKbwiRAt3ZyY4jNEk/nxNeFckDruRf4e6j7xvOFFKyE8WwOTUE g5Fcg7taIzsKR6mgqWhbCFF9exjNu5BMYjZs6Ba9krciPwiyuzpzz/Fx9vad0vua/nwe 3fxxY8qO4DO1QG9pmyC5ScMxr8vUnsbqMZyvcSQUD9DKsOT2s7c6r5EoIWnZLJsbAq7W RLN5LcI4Gcpfu3g5WCIc5XI7dEMvkLhmDflOlLS+shJpgq+21WzglvP549FRf97MRCqE n9Cg== X-Gm-Message-State: AOJu0YywyswJjXN7eimw7C7CFuEjwN5W4gZXhBXv3rRKER+Bz8BHQvjs 7Bg6Mm6pqM1Zx1jXYUAh4f8= X-Google-Smtp-Source: AGHT+IEvoucq5vGkiukqULI5yroAc9jrfO3u4Qb6tislqrRUkaUQHapZOolBoBQ4uiK96KmMoOm+Qg== X-Received: by 2002:a17:902:e9c6:b0:1c9:ccbc:4ca5 with SMTP id 
6-20020a170902e9c600b001c9ccbc4ca5mr3499059plk.60.1697082736192; Wed, 11 Oct 2023 20:52:16 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.52.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:52:15 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 47/48] perf annotate-data: Add stack canary type Date: Wed, 11 Oct 2023 20:51:10 -0700 Message-ID: <20231012035111.676789-48-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net When the stack protector is enabled, the compiler generates code that checks for stack overflow at runtime using a special value called the 'stack canary'. On x86_64, GCC hard-codes the stack canary location as %gs:40. While there's a definition of fixed_percpu_data in asm/processor.h, it seems that the header is not included everywhere, so in many places the type info cannot be found. As it's in a well-known location (at %gs:40), let's add a pseudo stack canary type to handle it specially.
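For illustration, the check described above can be sketched in standalone C. This is only a sketch under stated assumptions: `struct op_loc`, `enum insn_segment` and `looks_like_stack_canary()` are hypothetical stand-ins, not perf's actual definitions, and the offset of 40 is the x86_64 kernel convention mentioned above.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the segment info of an operand location. */
enum insn_segment {
	SEG_NONE = 0,
	SEG_X86_FS,
	SEG_X86_GS,
};

/* Hypothetical stand-in: only the fields this sketch needs. */
struct op_loc {
	enum insn_segment segment;	/* segment override of the memory operand */
	int offset;			/* constant displacement from the segment base */
};

/* Assumed from the text above: the kernel canary lives at %gs:40 on x86_64. */
#define X86_64_KERNEL_CANARY_OFFSET	40

/*
 * Return true when the operand is a %gs-relative access at the
 * well-known canary offset on x86; false for anything else.
 */
static bool looks_like_stack_canary(const char *arch_name,
				    const struct op_loc *loc)
{
	if (strcmp(arch_name, "x86") != 0)
		return false;

	return loc->segment == SEG_X86_GS &&
	       loc->offset == X86_64_KERNEL_CANARY_OFFSET;
}
```

A caller would use such a predicate only as a fallback, after the regular DWARF-based type lookup has failed, so that real type information always takes precedence over the pseudo type.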
Signed-off-by: Namhyung Kim --- tools/perf/util/annotate-data.c | 7 +++++++ tools/perf/util/annotate-data.h | 1 + tools/perf/util/annotate.c | 17 +++++++++++++++++ 3 files changed, 25 insertions(+) diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c index 677dc01432d3..68d7d207e2f7 100644 --- a/tools/perf/util/annotate-data.c +++ b/tools/perf/util/annotate-data.c @@ -37,6 +37,13 @@ struct annotated_data_type stackop_type = { }, }; +struct annotated_data_type canary_type = { + .self = { + .type_name = (char *)"(stack canary)", + .children = LIST_HEAD_INIT(canary_type.self.children), + }, +}; + /* Data type collection debug statistics */ struct annotated_data_stat ann_data_stat; diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h index 0bfef29fa52c..e293980eb11b 100644 --- a/tools/perf/util/annotate-data.h +++ b/tools/perf/util/annotate-data.h @@ -77,6 +77,7 @@ struct annotated_data_type { extern struct annotated_data_type unknown_type; extern struct annotated_data_type stackop_type; +extern struct annotated_data_type canary_type; /** * struct data_loc_info - Data location information diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index 414ae45b7c06..f343f90612d0 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -3720,6 +3720,17 @@ static bool is_stack_operation(struct arch *arch, struct disasm_line *dl) return false; } +static bool is_stack_canary(struct arch *arch, struct annotated_op_loc *loc) +{ + /* On x86_64, %gs:40 is used for stack canary */ + if (arch__is(arch, "x86")) { + if (loc->segment == INSN_SEG_X86_GS && loc->offset == 40) + return true; + } + + return false; +} + u64 annotate_calc_pcrel(struct map_symbol *ms, u64 ip, int offset, struct disasm_line *dl) { @@ -3889,6 +3900,12 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he) } mem_type = find_data_type(&dloc); + + if (mem_type == NULL && is_stack_canary(arch, op_loc)) { + 
mem_type = &canary_type; + dloc.type_offset = 0; + } + if (mem_type) istat->good++; else From patchwork Thu Oct 12 03:51:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Namhyung Kim X-Patchwork-Id: 13418252 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 255162109; Thu, 12 Oct 2023 03:52:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="bSAoOHO3" Received: from mail-pl1-x62e.google.com (mail-pl1-x62e.google.com [IPv6:2607:f8b0:4864:20::62e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C1E7C19BE; Wed, 11 Oct 2023 20:52:19 -0700 (PDT) Received: by mail-pl1-x62e.google.com with SMTP id d9443c01a7336-1c9d922c039so4500475ad.3; Wed, 11 Oct 2023 20:52:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697082738; x=1697687538; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=QS9t+1UxNgjEpS8viLjdstjzbOva5QnEdVrobE/OTdA=; b=bSAoOHO33tol/ESOxDmDvL/TQKPwfY5KygPPBHNi9SEkU8RsObyvpyRnEi8WfQZfSS 9hZSJI3IS60/9GRGd4ag9qnhiPsCnDEKFOZ4LlesUzuG5X5hY+ue4NRfvF+h2hIghVW8 BeHC9BwOXONma8OcURk0Zs2VcqtRlQXI4/UuU6ucviragmUjlbFoIwj9gX532QVlJm3d Xba5gViBlZUBhE65cNw15gr4N7hpVIsSNtbBLYJvhJlgEByqPuGXcZ9GlbEimWaQ8oUe 9O2dLRxf4/sxyktg5L+iFWj+tC32+kVhCTuLOWWMaSzDWzz4o3aEwfMl8MMZtuc3ztpN lCfQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697082738; x=1697687538; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from 
:to:cc:subject:date:message-id:reply-to; bh=QS9t+1UxNgjEpS8viLjdstjzbOva5QnEdVrobE/OTdA=; b=W2mX3hrUnG4rPb/t4d978qDHLhvccVx/J1dfd6cdaH6wE1l5QgIF2b+VeC1RULoHJh itrXDzlFGolC7eKhkxY7IcMVeWldJ/aYcuUVdR1DIOahX3EiIGpLxHuviWv+TmjGH1Z7 KpyKXKJR2l1PruxOn2LA9sdw3azJvX5NRz5+ItHAhfbSg0b5wTXtdQBD1MNhNwdNMLw6 +BFnuUr+mC74kOtZZHNCFfiy1oLtYvdJFBjztIVdFUPp6qo1V16nvrM+m2+W0LkapVuo 2DRd2KyKDEXRsByw+Y7nD5TMXpcjQKwCqrko4ZVL3xCiSnTePUoVG5c0DAmD26LwS5SJ JJrQ== X-Gm-Message-State: AOJu0YzB3h1d1ivphBiBMJJBXdppHdfU3Rafg5HHmhF2nHwgzOYMrtT+ VrjK4+SMPvJBDV8xFPRAMTY= X-Google-Smtp-Source: AGHT+IEXFEe8Cjy4EdVwkH80U06BM1WbOGfr6H7DU9pFZIgM7XFYxTFu9zlb6xkgBUsADflizhSD5A== X-Received: by 2002:a17:902:7893:b0:1bc:6c8:cded with SMTP id q19-20020a170902789300b001bc06c8cdedmr21435579pll.67.1697082737575; Wed, 11 Oct 2023 20:52:17 -0700 (PDT) Received: from bangji.hsd1.ca.comcast.net ([2601:647:6780:42e0:b1b9:d490:2f5e:be06]) by smtp.gmail.com with ESMTPSA id w8-20020a170902d70800b001bc18e579aesm711374ply.101.2023.10.11.20.52.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Oct 2023 20:52:17 -0700 (PDT) Sender: Namhyung Kim From: Namhyung Kim To: Arnaldo Carvalho de Melo , Jiri Olsa , Peter Zijlstra Cc: Ian Rogers , Adrian Hunter , Ingo Molnar , LKML , linux-perf-users@vger.kernel.org, Linus Torvalds , Stephane Eranian , Masami Hiramatsu , linux-toolchains@vger.kernel.org, linux-trace-devel@vger.kernel.org Subject: [PATCH 48/48] perf annotate-data: Add debug message Date: Wed, 11 Oct 2023 20:51:11 -0700 Message-ID: <20231012035111.676789-49-namhyung@kernel.org> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012035111.676789-1-namhyung@kernel.org> References: <20231012035111.676789-1-namhyung@kernel.org> Precedence: bulk X-Mailing-List: linux-trace-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, 
DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net This is just for debugging and not for merge. Signed-off-by: Namhyung Kim --- tools/perf/util/annotate-data.c | 122 +++++++++++++++++++++++++++++--- tools/perf/util/annotate-data.h | 2 +- 2 files changed, 114 insertions(+), 10 deletions(-) diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c index 68d7d207e2f7..bb0ad26e704d 100644 --- a/tools/perf/util/annotate-data.c +++ b/tools/perf/util/annotate-data.c @@ -115,6 +115,21 @@ void exit_type_state(struct type_state *state) } } +static void debug_print_type_name(Dwarf_Die *die) +{ + struct strbuf sb; + char *str; + + if (!verbose) + return; + + strbuf_init(&sb, 32); + __die_get_typename(die, &sb); + str = strbuf_detach(&sb, NULL); + pr_debug("%s (die:%lx)\n", str, dwarf_dieoffset(die)); + free(str); +} + /* * Compare type name and size to maintain them in a tree. * I'm not sure if DWARF would have information of a single type in many @@ -401,7 +416,7 @@ static struct type_state_stack *findnew_stack_state(struct type_state *state, * is used only at the given location and updates an entry in the table. 
*/ void update_var_state(struct type_state *state, struct data_loc_info *dloc, - u64 addr, struct die_var_type *var_types) + u64 addr, u64 off, struct die_var_type *var_types) { Dwarf_Die mem_die; struct die_var_type *var; @@ -422,14 +437,20 @@ void update_var_state(struct type_state *state, struct data_loc_info *dloc, if (var->reg == DWARF_REG_FB) { findnew_stack_state(state, var->offset, &mem_die); + pr_debug("var [%lx] stack fbreg (%x, %d) type=", off, var->offset, var->offset); + debug_print_type_name(&mem_die); } else if (var->reg == fbreg) { findnew_stack_state(state, var->offset - fb_offset, &mem_die); + pr_debug("var [%lx] stack cfa (%x, %d) fb-offset=%d type=", off, var->offset - fb_offset, var->offset - fb_offset, fb_offset); + debug_print_type_name(&mem_die); } else if (has_reg_type(state, var->reg) && var->offset == 0) { struct type_state_reg *reg; reg = &state->regs[var->reg]; reg->type = mem_die; reg->ok = true; + pr_debug("var [%lx] reg%d type=", off, var->reg); + debug_print_type_name(&mem_die); } } } @@ -509,6 +530,8 @@ void update_insn_state(struct type_state *state, struct data_loc_info *dloc, die_get_real_type(&func_die, &type_die)) { state->regs[state->ret_reg].type = type_die; state->regs[state->ret_reg].ok = true; + pr_debug("fun [%lx] reg0 return from %s type=", dl->al.offset, dwarf_diename(&func_die)); + debug_print_type_name(&type_die); } return; } @@ -517,8 +540,10 @@ void update_insn_state(struct type_state *state, struct data_loc_info *dloc, if (!strstr(dl->ins.name, "mov")) return; - if (annotate_get_insn_location(dloc->arch, dl, &loc) < 0) + if (annotate_get_insn_location(dloc->arch, dl, &loc) < 0) { + pr_debug("failed to get mov insn loc\n"); return; + } if (dloc->fb_cfa) { u64 ip = dloc->ms->sym->start + dl->al.offset; @@ -533,10 +558,14 @@ void update_insn_state(struct type_state *state, struct data_loc_info *dloc, if (!has_reg_type(state, dst->reg1)) return; - if (has_reg_type(state, src->reg1)) + if (has_reg_type(state, 
src->reg1)) { state->regs[dst->reg1] = state->regs[src->reg1]; - else if (dloc->ms->map->dso->kernel && - src->segment == INSN_SEG_X86_GS) { + if (state->regs[dst->reg1].ok) { + pr_debug("mov [%lx] reg%d -> reg%d type=", dl->al.offset, src->reg1, dst->reg1); + debug_print_type_name(&state->regs[dst->reg1].type); + } + } else if (dloc->ms->map->dso->kernel && + src->segment == INSN_SEG_X86_GS) { struct map_symbol *ms = dloc->ms; int offset = src->offset; u64 ip = ms->sym->start + dl->al.offset; @@ -556,6 +585,8 @@ void update_insn_state(struct type_state *state, struct data_loc_info *dloc, var_name, offset, &type_die)) { state->regs[dst->reg1].type = type_die; state->regs[dst->reg1].ok = true; + pr_debug("mov [%lx] percpu -> reg%d type=", dl->al.offset, dst->reg1); + debug_print_type_name(&state->regs[dst->reg1].type); } } else state->regs[dst->reg1].ok = false; @@ -586,8 +617,13 @@ void update_insn_state(struct type_state *state, struct data_loc_info *dloc, var_name, offset, &type_die)) { state->regs[dst->reg1].type = type_die; state->regs[dst->reg1].ok = true; - } else + pr_debug("mov [%lx] PC-rel -> reg%d type=", dl->al.offset, dst->reg1); + debug_print_type_name(&type_die); + } else { + if (var_name) + pr_debug("??? 
[%lx] PC-rel (%lx: %s%+d)\n", dl->al.offset, var_addr, var_name, offset); state->regs[dst->reg1].ok = false; + } } /* And check stack variables with offset */ else if (sreg == fbreg) { @@ -600,6 +636,8 @@ void update_insn_state(struct type_state *state, struct data_loc_info *dloc, &type_die)) { state->regs[dst->reg1].type = type_die; state->regs[dst->reg1].ok = true; + pr_debug("mov [%lx] stack (-%#x, %d) -> reg%d type=", dl->al.offset, -offset, offset, dst->reg1); + debug_print_type_name(&type_die); } else state->regs[dst->reg1].ok = false; } @@ -609,6 +647,8 @@ void update_insn_state(struct type_state *state, struct data_loc_info *dloc, src->offset, &type_die)) { state->regs[dst->reg1].type = type_die; state->regs[dst->reg1].ok = true; + pr_debug("mov [%lx] %#x(reg%d) -> reg%d type=", dl->al.offset, src->offset, sreg, dst->reg1); + debug_print_type_name(&type_die); } /* Or try another register if any */ else if (src->multi_regs && sreg == src->reg1 && @@ -648,6 +688,8 @@ void update_insn_state(struct type_state *state, struct data_loc_info *dloc, findnew_stack_state(state, offset, &state->regs[src->reg1].type); } + pr_debug("mov [%lx] reg%d -> stack (-%#x, %d) type=", dl->al.offset, src->reg1, -offset, offset); + debug_print_type_name(&state->regs[src->reg1].type); } /* * Ignore other transfers since it'd set a value in a struct @@ -751,6 +793,9 @@ static bool find_matching_type(struct type_state *state, (unsigned)dloc->type_offset >= size) return false; + pr_debug("%s: [%lx] reg=%d offset=%d type=", + __func__, dloc->ip - dloc->ms->sym->start, reg, dloc->type_offset); + debug_print_type_name(type_die); return true; } @@ -764,6 +809,10 @@ static bool find_matching_type(struct type_state *state, *type_die = stack->type; /* Update the type offset from the start of slot */ dloc->type_offset -= stack->offset; + + pr_debug("%s: [%lx] stack offset=%d type=", + __func__, dloc->ip - dloc->ms->sym->start, dloc->type_offset); + debug_print_type_name(type_die); return true; 
} @@ -785,6 +834,11 @@ static bool find_matching_type(struct type_state *state, *type_die = stack->type; /* Update the type offset from the start of slot */ dloc->type_offset -= fboff + stack->offset; + + pr_debug("%s: [%lx] cfa stack offset=%d type_offset=%d type=", + __func__, dloc->ip - dloc->ms->sym->start, + dloc->type_offset + stack->offset, dloc->type_offset); + debug_print_type_name(type_die); return true; } @@ -808,12 +862,13 @@ static bool find_data_type_insn(struct data_loc_info *dloc, int reg, list_for_each_entry(bb, basic_blocks, list) { struct disasm_line *dl = bb->begin; + pr_debug("bb: [%lx - %lx]\n", bb->begin->al.offset, bb->end->al.offset); list_for_each_entry_from(dl, &notes->src->source, al.node) { u64 this_ip = sym->start + dl->al.offset; u64 addr = map__rip_2objdump(dloc->ms->map, this_ip); /* Update variable type at this address */ - update_var_state(&state, dloc, addr, var_types); + update_var_state(&state, dloc, addr, dl->al.offset, var_types); if (this_ip == dloc->ip) { found = find_matching_type(&state, dloc, reg, @@ -846,6 +901,16 @@ static int find_data_type_block(struct data_loc_info *dloc, int reg, u64 src_ip, dst_ip; int ret = -1; + if (dloc->fb_cfa) { + u64 pc = map__rip_2objdump(dloc->ms->map, dloc->ip); + int fbreg, fboff; + + if (die_get_cfa(dloc->di->dbg, pc, &fbreg, &fboff) < 0) + fbreg = -1; + + pr_debug("CFA reg=%d offset=%d\n", fbreg, fboff); + } + dst_ip = dloc->ip; for (int i = nr_scopes - 1; i >= 0; i--) { Dwarf_Addr base, start, end; @@ -854,12 +919,16 @@ static int find_data_type_block(struct data_loc_info *dloc, int reg, if (dwarf_ranges(&scopes[i], 0, &base, &start, &end) < 0) break; + pr_debug("scope: [%d/%d] (die:%lx)\n", i + 1, nr_scopes, dwarf_dieoffset(&scopes[i])); src_ip = map__objdump_2rip(dloc->ms->map, start); /* Get basic blocks for this scope */ if (annotate_get_basic_blocks(dloc->ms->sym, src_ip, dst_ip, - &this_blocks) < 0) + &this_blocks) < 0) { + pr_debug("cannot find a basic block from %lx to %lx\n", +
src_ip - dloc->ms->sym->start, dst_ip - dloc->ms->sym->start); continue; + } prepend_basic_blocks(&this_blocks, &basic_blocks); /* Get variable info for this scope and add to var_types list */ @@ -895,6 +964,18 @@ static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die) int fb_offset = 0; bool is_fbreg = false; u64 pc; + char buf[64]; + + if (dloc->op->multi_regs) + snprintf(buf, sizeof(buf), " or reg%d", dloc->op->reg2); + else if (dloc->op->reg1 == DWARF_REG_PC) + snprintf(buf, sizeof(buf), " (PC)"); + else + buf[0] = '\0'; + + pr_debug("-----------------------------------------------------------\n"); + pr_debug("%s [%lx] for reg%d%s in %s\n", __func__, dloc->ip - dloc->ms->sym->start, + dloc->op->reg1, buf, dloc->ms->sym->name); /* * IP is a relative instruction address from the start of the map, as @@ -913,11 +994,15 @@ static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die) reg = loc->reg1; offset = loc->offset; + pr_debug("CU die offset: %lx\n", dwarf_dieoffset(&cu_die)); + if (reg == DWARF_REG_PC) { if (die_find_variable_by_addr(&cu_die, pc, dloc->var_addr, &var_die, &offset)) { ret = check_variable(&var_die, type_die, offset, /*is_pointer=*/false); + if (ret == 0) + pr_debug("found PC-rel by addr=%lx offset=%d\n", dloc->var_addr, offset); dloc->type_offset = offset; goto out; } @@ -926,6 +1011,8 @@ static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die) die_find_variable_at(&cu_die, dloc->var_name, pc, &var_die)) { ret = check_variable(&var_die, type_die, dloc->type_offset, /*is_pointer=*/false); + if (ret == 0) + pr_debug("found \"%s\" by name offset=%d\n", dloc->var_name, dloc->type_offset); /* dloc->type_offset was updated by the caller */ goto out; } @@ -978,6 +1065,21 @@ static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die) /* Found a variable, see if it's correct */ ret = check_variable(&var_die, type_die, offset, reg != DWARF_REG_PC && !is_fbreg); + if (ret 
== 0) { +#if 0 + const char *filename; + int lineno; + + if (cu_find_lineinfo(&cu_die, pc, &filename, &lineno) < 0) { + filename = "unknown"; + lineno = 0; + } +#endif + pr_debug("found \"%s\" in scope=%d/%d reg=%d offset=%#x (%d) loc->offset=%d fb-offset=%d (die:%lx scope:%lx) type=", + dwarf_diename(&var_die), i+1, nr_scopes, reg, offset, offset, loc->offset, fb_offset, dwarf_dieoffset(&var_die), + dwarf_dieoffset(&scopes[i])/*, filename, lineno*/); + debug_print_type_name(type_die); + } dloc->type_offset = offset; goto out; } @@ -994,8 +1096,10 @@ static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die) goto retry; } - if (ret < 0) + if (ret < 0) { + pr_debug("no variable found\n"); ann_data_stat.no_var++; + } out: free(scopes); diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h index e293980eb11b..44e0f3770432 100644 --- a/tools/perf/util/annotate-data.h +++ b/tools/perf/util/annotate-data.h @@ -166,7 +166,7 @@ void exit_type_state(struct type_state *state); /* Update type state table using variables */ void update_var_state(struct type_state *state, struct data_loc_info *dloc, - u64 addr, struct die_var_type *var_types); + u64 addr, u64 off, struct die_var_type *var_types); /* Update type state table for an instruction */ void update_insn_state(struct type_state *state, struct data_loc_info *dloc,