From patchwork Fri Jun 17 11:18:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Shuai Xue X-Patchwork-Id: 12885603 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C0760C433EF for ; Fri, 17 Jun 2022 11:25:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=QdNsDlWeD0yhM0VI2xyfhBvyvutRmUASNn30MrfRssQ=; b=dFp5sm/Zaiwna1 VkZV1i9I7HfLzJYBRv0aIFr1d6baaThQ6OcVpY8udN31sRAAO3SEKbHM/jB1iMdodhSCA0CJmtZGC 6nwiIA0hq193gaSZdoSDLliOpP/mPdQjm0JRUh3rdSU07wUj51e92/pdu21AG8Ug+CsJxdfOLtWM1 YJjkXF7eWabIRaxBRD2X6PhjeNzAzjTGngh+8i3gRv9Ht6poXOElqwdqpzQGHp4Boi64VZxNn1nlE 5fpmu8dp1yS0Ow5hhKp9MPAra0UbU3Ewgxs7uJPpV7KVlKQqDZzDsD/RNR8laBJmCDUh6Xnmza6fX StUBEFe+YSQ9gTggth/w==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1o2A57-007JY6-RQ; Fri, 17 Jun 2022 11:24:05 +0000 Received: from out30-132.freemail.mail.aliyun.com ([115.124.30.132]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1o29zw-007H2h-6I for linux-arm-kernel@lists.infradead.org; Fri, 17 Jun 2022 11:18:46 +0000 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R751e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018045170;MF=xueshuai@linux.alibaba.com;NM=1;PH=DS;RN=8;SR=0;TI=SMTPD_---0VGernk5_1655464716; Received: from localhost.localdomain(mailfrom:xueshuai@linux.alibaba.com fp:SMTPD_---0VGernk5_1655464716) by smtp.aliyun-inc.com; Fri, 17 Jun 2022 19:18:37 +0800 From: Shuai Xue To: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, will@kernel.org, mark.rutland@arm.com Cc: baolin.wang@linux.alibaba.com, yaohongbo@linux.alibaba.com, nengchen@linux.alibaba.com, zhuo.song@linux.alibaba.com Subject: [PATCH v1 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Date: Fri, 17 Jun 2022 19:18:23 +0800 Message-Id: <20220617111825.92911-2-xueshuai@linux.alibaba.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220617111825.92911-1-xueshuai@linux.alibaba.com> References: <20220617111825.92911-1-xueshuai@linux.alibaba.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20220617_041844_477235_16A970D6 X-CRM114-Status: GOOD ( 16.25 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Alibaba's T-Head SoC implements uncore PMU for performance and functional debugging to facilitate system maintenance. Document it to provide guidance on how to use it. Signed-off-by: Shuai Xue Reported-by: kernel test robot --- .../admin-guide/perf/alibaba_pmu.rst | 94 +++++++++++++++++++ Documentation/admin-guide/perf/index.rst | 1 + 2 files changed, 95 insertions(+) create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst diff --git a/Documentation/admin-guide/perf/alibaba_pmu.rst b/Documentation/admin-guide/perf/alibaba_pmu.rst new file mode 100644 index 000000000000..337f4f1d4c54 --- /dev/null +++ b/Documentation/admin-guide/perf/alibaba_pmu.rst @@ -0,0 +1,94 @@ +============================================================= +Alibaba's T-Head SoC Uncore Performance Monitoring Unit (PMU) +============================================================= + +The Yitian 710, custom-built by Alibaba Group's chip development business, +T-Head, implements uncore PMU for performance and functional debugging to +facilitate system maintenance. + +DDR Sub-System Driveway (DRW) PMU Driver +========================================= + +The Yitian 710 SoC supports the most advanced DDR5/4 DRAM to provide +tremendous memory bandwidth for cloud computing and HPC. The Driveway is a +module which is a bridge between a router of mesh network and memory +controller. It provides various functions like secure control, address map +and so on. + +Yitian 710 employs eight DDR5/4 channels, four on each die. Each channel is +independent of others to service system memory requests. And one DDR5 +channel is split into two independent sub-channels. The DDR Sub-System +Driveway implements separate PMUs for each sub-channel to monitor various +performance metrics. + +The Driveway PMU devices are named as ali_drw_ with perf. +For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two +sub-channels of the same channel in die 0. And the PMU device of die 1 is +prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000. + +Each sub-channel has 36 PMU counters in total, which is classified into +four groups: + +- Group 0: PMU Cycle Counter. This group has one pair of counters + pmu_cycle_cnt_low and pmu_cycle_cnt_high, that is used as the cycle count + based on DDRC core clock. + +- Group 1: PMU Bandwidth Counters. This group has 8 counters that are used + to count the total access number of either the eight bank groups in a + selected rank, or four ranks separately in the first 4 counters. The base + transfer unit is 64B. + +- Group 2: PMU Retry Counters. This group has 10 counters, that intend to + count the total retry number of each type of uncorrectable error. + +- Group 3: PMU Common Counters. This group has 16 counters, that are used + to count the common events. + +For now, the Driveway PMU driver only uses counters in group 0 and group 3. + +Example usage of counting memory data bandwidth:: + + perf stat \ + -e ali_drw_21000/perf_hif_wr/ \ + -e ali_drw_21000/perf_hif_rd/ \ + -e ali_drw_21000/perf_hif_rmw/ \ + -e ali_drw_21000/perf_cycle/ \ + -e ali_drw_21080/perf_hif_wr/ \ + -e ali_drw_21080/perf_hif_rd/ \ + -e ali_drw_21080/perf_hif_rmw/ \ + -e ali_drw_21080/perf_cycle/ \ + -e ali_drw_23000/perf_hif_wr/ \ + -e ali_drw_23000/perf_hif_rd/ \ + -e ali_drw_23000/perf_hif_rmw/ \ + -e ali_drw_23000/perf_cycle/ \ + -e ali_drw_23080/perf_hif_wr/ \ + -e ali_drw_23080/perf_hif_rd/ \ + -e ali_drw_23080/perf_hif_rmw/ \ + -e ali_drw_23080/perf_cycle/ \ + -e ali_drw_25000/perf_hif_wr/ \ + -e ali_drw_25000/perf_hif_rd/ \ + -e ali_drw_25000/perf_hif_rmw/ \ + -e ali_drw_25000/perf_cycle/ \ + -e ali_drw_25080/perf_hif_wr/ \ + -e ali_drw_25080/perf_hif_rd/ \ + -e ali_drw_25080/perf_hif_rmw/ \ + -e ali_drw_25080/perf_cycle/ \ + -e ali_drw_27000/perf_hif_wr/ \ + -e ali_drw_27000/perf_hif_rd/ \ + -e ali_drw_27000/perf_hif_rmw/ \ + -e ali_drw_27000/perf_cycle/ \ + -e ali_drw_27080/perf_hif_wr/ \ + -e ali_drw_27080/perf_hif_rd/ \ + -e ali_drw_27080/perf_hif_rmw/ \ + -e ali_drw_27080/perf_cycle/ -- sleep 10 + +The average DRAM bandwidth can be calculated as follows: + +- Read Bandwidth = perf_hif_rd * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle +- Write Bandwidth = (perf_hif_wr + perf_hif_rmw) * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle + +Here, DDRC_WIDTH = 64 bytes. + +The current driver does not support sampling. So "perf record" is +unsupported. Also attach to a task is unsupported as the events are all +uncore. diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst index 69b23f087c05..bf466ae91c6c 100644 --- a/Documentation/admin-guide/perf/index.rst +++ b/Documentation/admin-guide/perf/index.rst @@ -17,3 +17,4 @@ Performance monitor support xgene-pmu arm_dsu_pmu thunderx2-pmu + thead_pmu