From patchwork Thu Oct 12 17:56:38 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ian Rogers
X-Patchwork-Id: 13419583
Date: Thu, 12 Oct 2023 10:56:38 -0700
Message-Id: <20231012175645.1849503-1-irogers@google.com>
Mime-Version: 1.0
X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog
Subject: [PATCH v2 0/7] PMU performance improvements
From: Ian Rogers
To: Suzuki K Poulose,
	Mike Leach, James Clark, Leo Yan, John Garry, Will Deacon,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Thomas Richter, Ravi Bangoria,
	Kajol Jain, Jing Zhang, Kan Liang, Yang Jihong,
	coresight@lists.linaro.org, linux-arm-kernel@lists.infradead.org,
	linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org

Performance improvements to PMU scanning by holding onto the
event/metric tables for a cpuid (avoiding regular expression
comparisons) and by lazily computing the default perf_event_attr for a
PMU.
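
To picture the first idea, here is a minimal sketch of remembering a
table lookup so the regular expression comparisons happen only once per
PMU and cpuid. It is not the code in this series: it is written in
Python for brevity, and TableCache, CPUID_TABLES, the cpuid strings and
the table names are all hypothetical.

```python
import re
from typing import Dict, Optional, Tuple

# Hypothetical stand-in for a generated cpuid-pattern -> tables mapping.
# The patterns and table names are invented for this sketch.
CPUID_TABLES = [
    (re.compile(r"VendorA-1-2[A-F]"), "vendor_a_tables"),
    (re.compile(r"VendorB-7-.*"), "vendor_b_tables"),
]


class TableCache:
    """Remembers the tables found for each (PMU name, cpuid) pair."""

    def __init__(self) -> None:
        self._cache: Dict[Tuple[str, str], Optional[str]] = {}
        self.regex_scans = 0  # number of slow-path scans, for demonstration

    def find(self, pmu_name: str, cpuid: str) -> Optional[str]:
        key = (pmu_name, cpuid)
        if key in self._cache:
            # Fast path: covers both "found" and "no table" results.
            return self._cache[key]
        # Slow path: compare the cpuid against each pattern until one matches.
        self.regex_scans += 1
        tables = next(
            (name for pattern, name in CPUID_TABLES if pattern.match(cpuid)),
            None,
        )
        self._cache[key] = tables
        return tables
```

Keying on both the PMU name and the cpuid, and storing None for a miss,
means repeated lookups never reach the regular expressions again, even
when no table exists.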

Before:
% Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
  Average core PMU scanning took: 251.990 usec (+- 4.009 usec)
  Average PMU scanning took: 3222.460 usec (+- 211.234 usec)
% Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
  Average core PMU scanning took: 260.120 usec (+- 7.905 usec)
  Average PMU scanning took: 3228.995 usec (+- 211.196 usec)
% Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
  Average core PMU scanning took: 252.310 usec (+- 3.980 usec)
  Average PMU scanning took: 3220.675 usec (+- 210.844 usec)

After:
% Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
  Average core PMU scanning took: 28.530 usec (+- 0.602 usec)
  Average PMU scanning took: 275.725 usec (+- 18.253 usec)
% Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
  Average core PMU scanning took: 28.720 usec (+- 0.446 usec)
  Average PMU scanning took: 271.015 usec (+- 18.762 usec)
% Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
  Average core PMU scanning took: 31.040 usec (+- 0.612 usec)
  Average PMU scanning took: 267.340 usec (+- 17.209 usec)

Measuring the pmu-scan benchmark on a Tigerlake laptop: core PMU
scanning is reduced to 11.5% of the previous execution time and all PMU
scanning is reduced to 8.4% of the previous execution time. There is a
4.3% reduction in openat system calls.

v2: Address feedback from Adrian Hunter and Yang Jihong to allow the
    caching to handle varying CPUIDs per PMU (currently an ARM64-only
    feature) and to cache when there is no table to return.
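
The percentages quoted above can be re-derived from the benchmark
output. The sketch below assumes they are the ratio of the mean "after"
time to the mean "before" time across the three runs; that method is an
assumption, and the helper name remaining_percent is hypothetical.

```python
from statistics import mean

# Average scan times in usec, copied from the three runs before and after.
core_before = [251.990, 260.120, 252.310]
core_after = [28.530, 28.720, 31.040]
all_before = [3222.460, 3228.995, 3220.675]
all_after = [275.725, 271.015, 267.340]


def remaining_percent(before, after):
    """Mean 'after' time as a percentage of the mean 'before' time."""
    return 100.0 * mean(after) / mean(before)


core_pct = remaining_percent(core_before, core_after)
all_pct = remaining_percent(all_before, all_after)
print(f"core PMU scanning: {core_pct:.1f}% of previous time")
print(f"all PMU scanning:  {all_pct:.1f}% of previous time")
```

Under that assumption the results agree with the 11.5% and 8.4% figures
to one decimal place.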

Ian Rogers (7):
  perf pmu: Rename perf_pmu__get_default_config to perf_pmu__arch_init
  perf intel-pt: Move PMU initialization from default config code
  perf arm-spe: Move PMU initialization from default config code
  perf pmu: Const-ify file APIs
  perf pmu: Const-ify perf_pmu__config_terms
  perf pmu-events: Remember the perf_events_map for a PMU
  perf pmu: Lazily compute default config

 tools/perf/arch/arm/util/cs-etm.c    |  13 +---
 tools/perf/arch/arm/util/pmu.c       |  10 +--
 tools/perf/arch/arm64/util/arm-spe.c |  48 ++++++------
 tools/perf/arch/s390/util/pmu.c      |   3 +-
 tools/perf/arch/x86/util/intel-pt.c  |  27 +++----
 tools/perf/arch/x86/util/pmu.c       |   6 +-
 tools/perf/pmu-events/jevents.py     | 109 +++++++++++++++++----------
 tools/perf/util/arm-spe.h            |   4 +-
 tools/perf/util/cs-etm.h             |   2 +-
 tools/perf/util/intel-pt.h           |   3 +-
 tools/perf/util/parse-events.c       |  12 +--
 tools/perf/util/pmu.c                |  38 +++++-----
 tools/perf/util/pmu.h                |  22 +++---
 tools/perf/util/python.c             |   2 +-
 14 files changed, 160 insertions(+), 139 deletions(-)