From patchwork Fri Nov 2 15:25:17 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jordan Crouse X-Patchwork-Id: 10665649 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 93A64109C for ; Fri, 2 Nov 2018 15:25:38 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 83947288A9 for ; Fri, 2 Nov 2018 15:25:38 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 778012B5CA; Fri, 2 Nov 2018 15:25:38 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 07B0628D72 for ; Fri, 2 Nov 2018 15:25:38 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id CFF83891F8; Fri, 2 Nov 2018 15:25:33 +0000 (UTC) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from smtp.codeaurora.org (smtp.codeaurora.org [198.145.29.96]) by gabe.freedesktop.org (Postfix) with ESMTPS id A7688891F8; Fri, 2 Nov 2018 15:25:32 +0000 (UTC) Received: by smtp.codeaurora.org (Postfix, from userid 1000) id 5D5D76021C; Fri, 2 Nov 2018 15:25:32 +0000 (UTC) Received: from jcrouse-lnx.qualcomm.com (i-global254.qualcomm.com [199.106.103.254]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: jcrouse@smtp.codeaurora.org) by smtp.codeaurora.org (Postfix) with ESMTPSA id A3EE66021C; Fri, 2 Nov 2018 15:25:30 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org A3EE66021C From: Jordan Crouse To: freedreno@lists.freedesktop.org Subject: [PATCH 01/10] drm/msm: Update generated headers Date: Fri, 2 Nov 2018 09:25:17 -0600 Message-Id: <20181102152526.23854-2-jcrouse@codeaurora.org> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20181102152526.23854-1-jcrouse@codeaurora.org> References: <20181102152526.23854-1-jcrouse@codeaurora.org> X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-arm-msm@vger.kernel.org, hoegsberg@chromium.org, smasetty@codeaurora.org, dri-devel@lists.freedesktop.org MIME-Version: 1.0 Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" X-Virus-Scanned: ClamAV using ClamSMTP Resync the a6xx generated headers from the database. Signed-off-by: Jordan Crouse --- drivers/gpu/drm/msm/adreno/a6xx.xml.h | 54 +++++++++++++++++++++------ 1 file changed, 42 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx.xml.h b/drivers/gpu/drm/msm/adreno/a6xx.xml.h index a6f7c40454a6..3b04110308ec 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx.xml.h +++ b/drivers/gpu/drm/msm/adreno/a6xx.xml.h @@ -8,17 +8,17 @@ This file was generated by the rules-ng-ng headergen tool in this git repository git clone https://github.com/freedreno/envytools.git The rules-ng-ng source files this header was generated from are: -- /home/robclark/src/envytools/rnndb/adreno.xml ( 501 bytes, from 2018-07-03 19:37:13) -- /home/robclark/src/envytools/rnndb/freedreno_copyright.xml ( 1572 bytes, from 2018-07-03 19:37:13) -- /home/robclark/src/envytools/rnndb/adreno/a2xx.xml ( 36805 bytes, from 2018-07-03 19:37:13) -- /home/robclark/src/envytools/rnndb/adreno/adreno_common.xml ( 13634 bytes, from 2018-07-03 19:37:13) -- /home/robclark/src/envytools/rnndb/adreno/adreno_pm4.xml ( 42585 bytes, from 2018-10-04 19:06:37) -- /home/robclark/src/envytools/rnndb/adreno/a3xx.xml ( 83840 bytes, from 2018-07-03 19:37:13) -- /home/robclark/src/envytools/rnndb/adreno/a4xx.xml ( 112086 bytes, from 2018-07-03 19:37:13) -- /home/robclark/src/envytools/rnndb/adreno/a5xx.xml ( 147240 bytes, from 2018-10-04 19:06:37) -- /home/robclark/src/envytools/rnndb/adreno/a6xx.xml ( 139581 bytes, from 2018-10-04 19:06:42) -- /home/robclark/src/envytools/rnndb/adreno/a6xx_gmu.xml ( 10431 bytes, from 2018-09-14 13:03:07) -- /home/robclark/src/envytools/rnndb/adreno/ocmem.xml ( 1773 bytes, from 2018-07-03 19:37:13) +- ./adreno.xml ( 501 bytes, from 2018-10-10 15:35:49) +- ./freedreno_copyright.xml ( 1572 bytes, from 2016-10-24 21:12:27) +- ./adreno/a2xx.xml ( 37936 bytes, from 2018-10-10 15:35:49) +- ./adreno/adreno_common.xml ( 14201 bytes, from 2018-10-10 15:35:49) +- ./adreno/adreno_pm4.xml ( 42864 bytes, from 2018-10-10 15:35:49) +- ./adreno/a3xx.xml ( 83840 bytes, from 2017-12-05 18:20:27) +- ./adreno/a4xx.xml ( 112086 bytes, from 2018-10-10 15:35:49) +- ./adreno/a5xx.xml ( 147240 bytes, from 2018-10-10 15:35:49) +- ./adreno/a6xx.xml ( 140514 bytes, from 2018-10-10 15:35:49) +- ./adreno/a6xx_gmu.xml ( 10431 bytes, from 2018-10-10 15:35:49) +- ./adreno/ocmem.xml ( 1773 bytes, from 2016-10-24 21:12:27) Copyright (C) 2013-2018 by the following authors: - Rob Clark (robclark) @@ -501,7 +501,7 @@ enum a6xx_vfd_perfcounter_select { PERF_VFDP_VS_STAGE_WAVES = 22, }; -enum a6xx_hslq_perfcounter_select { +enum a6xx_hlsq_perfcounter_select { PERF_HLSQ_BUSY_CYCLES = 0, PERF_HLSQ_STALL_CYCLES_UCHE = 1, PERF_HLSQ_STALL_CYCLES_SP_STATE = 2, @@ -2959,6 +2959,8 @@ static inline uint32_t A6XX_GRAS_SC_WINDOW_SCISSOR_BR_Y(uint32_t val) #define A6XX_GRAS_LRZ_CNTL_ENABLE 0x00000001 #define A6XX_GRAS_LRZ_CNTL_LRZ_WRITE 0x00000002 #define A6XX_GRAS_LRZ_CNTL_GREATER 0x00000004 +#define A6XX_GRAS_LRZ_CNTL_UNK3 0x00000008 +#define A6XX_GRAS_LRZ_CNTL_UNK4 0x00000010 #define REG_A6XX_GRAS_UNKNOWN_8101 0x00008101 @@ -2997,6 +2999,13 @@ static inline uint32_t A6XX_GRAS_LRZ_BUFFER_PITCH_ARRAY_PITCH(uint32_t val) #define REG_A6XX_GRAS_UNKNOWN_8110 0x00008110 #define REG_A6XX_GRAS_2D_BLIT_CNTL 0x00008400 +#define A6XX_GRAS_2D_BLIT_CNTL_COLOR_FORMAT__MASK 0x0000ff00 +#define A6XX_GRAS_2D_BLIT_CNTL_COLOR_FORMAT__SHIFT 8 +static inline uint32_t A6XX_GRAS_2D_BLIT_CNTL_COLOR_FORMAT(enum a6xx_color_fmt val) +{ + return ((val) << A6XX_GRAS_2D_BLIT_CNTL_COLOR_FORMAT__SHIFT) & A6XX_GRAS_2D_BLIT_CNTL_COLOR_FORMAT__MASK; +} +#define A6XX_GRAS_2D_BLIT_CNTL_SCISSOR 0x00010000 #define REG_A6XX_GRAS_2D_SRC_TL_X 0x00008401 #define A6XX_GRAS_2D_SRC_TL_X_X__MASK 0x00ffff00 @@ -3642,6 +3651,9 @@ static inline uint32_t A6XX_RB_WINDOW_OFFSET_Y(uint32_t val) #define REG_A6XX_RB_SAMPLE_COUNT_CONTROL 0x00008891 #define A6XX_RB_SAMPLE_COUNT_CONTROL_COPY 0x00000002 +#define REG_A6XX_RB_LRZ_CNTL 0x00008898 +#define A6XX_RB_LRZ_CNTL_ENABLE 0x00000001 + #define REG_A6XX_RB_UNKNOWN_88D0 0x000088d0 #define REG_A6XX_RB_BLIT_SCISSOR_TL 0x000088d1 @@ -3780,6 +3792,9 @@ static inline uint32_t A6XX_RB_2D_BLIT_CNTL_COLOR_FORMAT(enum a6xx_color_fmt val { return ((val) << A6XX_RB_2D_BLIT_CNTL_COLOR_FORMAT__SHIFT) & A6XX_RB_2D_BLIT_CNTL_COLOR_FORMAT__MASK; } +#define A6XX_RB_2D_BLIT_CNTL_SCISSOR 0x00010000 + +#define REG_A6XX_RB_UNKNOWN_8C01 0x00008c01 #define REG_A6XX_RB_2D_DST_INFO 0x00008c17 #define A6XX_RB_2D_DST_INFO_COLOR_FORMAT__MASK 0x000000ff @@ -4643,6 +4658,8 @@ static inline uint32_t A6XX_SP_FS_CONFIG_NSAMP(uint32_t val) #define REG_A6XX_SP_UNKNOWN_AB20 0x0000ab20 +#define REG_A6XX_SP_UNKNOWN_ACC0 0x0000acc0 + #define REG_A6XX_SP_UNKNOWN_AE00 0x0000ae00 #define REG_A6XX_SP_UNKNOWN_AE03 0x0000ae03 @@ -4700,11 +4717,20 @@ static inline uint32_t A6XX_SP_PS_2D_SRC_INFO_COLOR_SWAP(enum a3xx_color_swap va return ((val) << A6XX_SP_PS_2D_SRC_INFO_COLOR_SWAP__SHIFT) & A6XX_SP_PS_2D_SRC_INFO_COLOR_SWAP__MASK; } #define A6XX_SP_PS_2D_SRC_INFO_FLAGS 0x00001000 +#define A6XX_SP_PS_2D_SRC_INFO_FILTER 0x00010000 #define REG_A6XX_SP_PS_2D_SRC_LO 0x0000b4c2 #define REG_A6XX_SP_PS_2D_SRC_HI 0x0000b4c3 +#define REG_A6XX_SP_PS_2D_SRC_SIZE 0x0000b4c4 +#define A6XX_SP_PS_2D_SRC_SIZE_PITCH__MASK 0x01fffe00 +#define A6XX_SP_PS_2D_SRC_SIZE_PITCH__SHIFT 9 +static inline uint32_t A6XX_SP_PS_2D_SRC_SIZE_PITCH(uint32_t val) +{ + return ((val >> 6) << A6XX_SP_PS_2D_SRC_SIZE_PITCH__SHIFT) & A6XX_SP_PS_2D_SRC_SIZE_PITCH__MASK; +} + #define REG_A6XX_SP_PS_2D_SRC_FLAGS_LO 0x0000b4ca #define REG_A6XX_SP_PS_2D_SRC_FLAGS_HI 0x0000b4cb @@ -5365,5 +5391,9 @@ static inline uint32_t A6XX_CX_DBGC_CFG_DBGBUS_BYTEL_1_BYTEL15(uint32_t val) #define REG_A6XX_CX_DBGC_CFG_DBGBUS_TRACE_BUF2 0x00000030 +#define REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_0 0x00000001 + +#define REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_1 0x00000002 + #endif /* A6XX_XML */ From patchwork Fri Nov 2 15:25:18 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jordan Crouse X-Patchwork-Id: 10665655 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EEE3C109C for ; Fri, 2 Nov 2018 15:25:40 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DE4512B5D3 for ; Fri, 2 Nov 2018 15:25:40 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D1E742B5CA; Fri, 2 Nov 2018 15:25:40 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 6C4F82B5D3 for ; Fri, 2 Nov 2018 15:25:40 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 9281589359; Fri, 2 Nov 2018 15:25:35 +0000 (UTC) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from smtp.codeaurora.org (smtp.codeaurora.org [198.145.29.96]) by gabe.freedesktop.org (Postfix) with ESMTPS id AD1088921B; Fri, 2 Nov 2018 15:25:32 +0000 (UTC) Received: by smtp.codeaurora.org (Postfix, from userid 1000) id 910456086A; Fri, 2 Nov 2018 15:25:32 +0000 (UTC) Received: from jcrouse-lnx.qualcomm.com (i-global254.qualcomm.com [199.106.103.254]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: jcrouse@smtp.codeaurora.org) by smtp.codeaurora.org (Postfix) with ESMTPSA id 6B74660ADB; Fri, 2 Nov 2018 15:25:31 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org 6B74660ADB From: Jordan Crouse To: freedreno@lists.freedesktop.org Subject: [PATCH 02/10] drm/msm/gpu: Allocate the correct size for the GPU memptrs Date: Fri, 2 Nov 2018 09:25:18 -0600 Message-Id: <20181102152526.23854-3-jcrouse@codeaurora.org> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20181102152526.23854-1-jcrouse@codeaurora.org> References: <20181102152526.23854-1-jcrouse@codeaurora.org> X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-arm-msm@vger.kernel.org, hoegsberg@chromium.org, smasetty@codeaurora.org, dri-devel@lists.freedesktop.org MIME-Version: 1.0 Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" X-Virus-Scanned: ClamAV using ClamSMTP Allocate the correct buffer size for the GPU memptrs. The incorrect size hasn't affected us thus far since the incorrect size was larger than the intended size and we're still stuck on page sized granularity anyway but technically correct is the best kind of correct. Signed-off-by: Jordan Crouse --- drivers/gpu/drm/msm/msm_gpu.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c index ff958486819b..b7b4539c1731 100644 --- a/drivers/gpu/drm/msm/msm_gpu.c +++ b/drivers/gpu/drm/msm/msm_gpu.c @@ -920,7 +920,8 @@ int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev, goto fail; } - memptrs = msm_gem_kernel_new(drm, sizeof(*gpu->memptrs_bo), + memptrs = msm_gem_kernel_new(drm, + sizeof(struct msm_rbmemptrs) * nr_rings, MSM_BO_UNCACHED, gpu->aspace, &gpu->memptrs_bo, &memptrs_iova); From patchwork Fri Nov 2 15:25:19 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jordan Crouse X-Patchwork-Id: 10665659 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7611813BF for ; Fri, 2 Nov 2018 15:25:44 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 63E7328D72 for ; Fri, 2 Nov 2018 15:25:44 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 582B02B5CA; Fri, 2 Nov 2018 15:25:44 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 1075028D72 for ; Fri, 2 Nov 2018 15:25:44 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0615189385; Fri, 2 Nov 2018 15:25:36 +0000 (UTC) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from smtp.codeaurora.org (smtp.codeaurora.org [198.145.29.96]) by gabe.freedesktop.org (Postfix) with ESMTPS id 6E01A891DC; Fri, 2 Nov 2018 15:25:33 +0000 (UTC) Received: by smtp.codeaurora.org (Postfix, from userid 1000) id 553D760CF4; Fri, 2 Nov 2018 15:25:33 +0000 (UTC) Received: from jcrouse-lnx.qualcomm.com (i-global254.qualcomm.com [199.106.103.254]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: jcrouse@smtp.codeaurora.org) by smtp.codeaurora.org (Postfix) with ESMTPSA id 363BD60A34; Fri, 2 Nov 2018 15:25:32 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org 363BD60A34 From: Jordan Crouse To: freedreno@lists.freedesktop.org Subject: [PATCH 03/10] drm/msm: Gracefully handle failure in _msm_gem_kernel_new Date: Fri, 2 Nov 2018 09:25:19 -0600 Message-Id: <20181102152526.23854-4-jcrouse@codeaurora.org> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20181102152526.23854-1-jcrouse@codeaurora.org> References: <20181102152526.23854-1-jcrouse@codeaurora.org> X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-arm-msm@vger.kernel.org, hoegsberg@chromium.org, smasetty@codeaurora.org, dri-devel@lists.freedesktop.org MIME-Version: 1.0 Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" X-Virus-Scanned: ClamAV using ClamSMTP If any of the function calls in _msm_gem_kernel_new fail we need to make sure to dereference the GEM object with the appropriate function for the current locking state. Signed-off-by: Jordan Crouse --- drivers/gpu/drm/msm/msm_gem.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c index 00c795ced02c..4646e9e45fc2 100644 --- a/drivers/gpu/drm/msm/msm_gem.c +++ b/drivers/gpu/drm/msm/msm_gem.c @@ -1041,23 +1041,29 @@ static void *_msm_gem_kernel_new(struct drm_device *dev, uint32_t size, if (iova) { ret = msm_gem_get_iova(obj, aspace, iova); - if (ret) { - drm_gem_object_put(obj); - return ERR_PTR(ret); - } + if (ret) + goto err; } vaddr = msm_gem_get_vaddr(obj); if (IS_ERR(vaddr)) { msm_gem_put_iova(obj, aspace); - drm_gem_object_put(obj); - return ERR_CAST(vaddr); + ret = PTR_ERR(vaddr); + goto err; } if (bo) *bo = obj; return vaddr; +err: + if (locked) + drm_gem_object_put(obj); + else + drm_gem_object_put_unlocked(obj); + + return ERR_PTR(ret); + } void *msm_gem_kernel_new(struct drm_device *dev, uint32_t size, From patchwork Fri Nov 2 15:25:20 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jordan Crouse X-Patchwork-Id: 10665657 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E999D109C for ; Fri, 2 Nov 2018 15:25:42 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D910528D72 for ; Fri, 2 Nov 2018 15:25:42 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id CCC552B5CA; Fri, 2 Nov 2018 15:25:42 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 5456B28D72 for ; Fri, 2 Nov 2018 15:25:42 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 8410A89301; Fri, 2 Nov 2018 15:25:35 +0000 (UTC) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from smtp.codeaurora.org (smtp.codeaurora.org [198.145.29.96]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4253E89265; Fri, 2 Nov 2018 15:25:34 +0000 (UTC) Received: by smtp.codeaurora.org (Postfix, from userid 1000) id 2A72060C4B; Fri, 2 Nov 2018 15:25:34 +0000 (UTC) Received: from jcrouse-lnx.qualcomm.com (i-global254.qualcomm.com [199.106.103.254]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: jcrouse@smtp.codeaurora.org) by smtp.codeaurora.org (Postfix) with ESMTPSA id 04D8360C4D; Fri, 2 Nov 2018 15:25:32 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org 04D8360C4D From: Jordan Crouse To: freedreno@lists.freedesktop.org Subject: [PATCH 04/10] drm/msm/gpu: Add per-submission statistics Date: Fri, 2 Nov 2018 09:25:20 -0600 Message-Id: <20181102152526.23854-5-jcrouse@codeaurora.org> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20181102152526.23854-1-jcrouse@codeaurora.org> References: <20181102152526.23854-1-jcrouse@codeaurora.org> X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-arm-msm@vger.kernel.org, hoegsberg@chromium.org, smasetty@codeaurora.org, dri-devel@lists.freedesktop.org MIME-Version: 1.0 Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" X-Virus-Scanned: ClamAV using ClamSMTP Add infrastructure to track statistics for GPU submissions by sampling certain perfcounters before and after a submission. To store the statistics, the per-ring memptrs region is expanded to include room for up to 64 entries - this should cover a reasonable amount of inflight submissions without worrying about losing data. The target specific code inserts PM4 commands to sample the counters before and after submission and store them in the data region. The CPU can access the data after the submission retires to make sense of the statistics and communicate them to the user. Signed-off-by: Jordan Crouse --- drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 20 +++++++++------- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 34 ++++++++++++++++++++------- drivers/gpu/drm/msm/msm_ringbuffer.h | 16 +++++++++++++ 3 files changed, 54 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c index 0a0ceb76e2ba..e816947ac7d8 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c @@ -346,16 +346,20 @@ static int a6xx_rpmh_start(struct a6xx_gmu *gmu) ret = gmu_poll_timeout(gmu, REG_A6XX_RSCC_SEQ_BUSY_DRV0, val, !val, 100, 10000); - if (!ret) { - gmu_write(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ, 0); - - /* Re-enable the power counter */ - gmu_write(gmu, REG_A6XX_GMU_CX_GMU_POWER_COUNTER_ENABLE, 1); - return 0; + if (ret) { + DRM_DEV_ERROR(gmu->dev, "GPU RSC sequence stuck while waking up the GPU\n"); + return ret; } - DRM_DEV_ERROR(gmu->dev, "GPU RSC sequence stuck while waking up the GPU\n"); - return ret; + gmu_write(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ, 0); + + /* Set up CX GMU counter 0 to count busy ticks */ + gmu_write(gmu, REG_A6XX_GPU_GMU_AO_GPU_CX_BUSY_MASK, 0xff000000); + gmu_rmw(gmu, REG_A6XX_GMU_CX_GMU_POWER_COUNTER_SELECT_0, 0xff, 0x20); + + /* Enable the power counter */ + gmu_write(gmu, REG_A6XX_GMU_CX_GMU_POWER_COUNTER_ENABLE, 1); + return 0; } static void a6xx_rpmh_stop(struct a6xx_gmu *gmu) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index 38b7a5a92bfb..cf66edfb5246 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -67,13 +67,34 @@ static void a6xx_flush(struct msm_gpu *gpu, struct msm_ringbuffer *ring) gpu_write(gpu, REG_A6XX_CP_RB_WPTR, wptr); } +static void get_stats_counter(struct msm_ringbuffer *ring, u32 counter, + u64 iova) +{ + OUT_PKT7(ring, CP_REG_TO_MEM, 3); + OUT_RING(ring, counter | (1 << 30) | (2 << 18)); + OUT_RING(ring, lower_32_bits(iova)); + OUT_RING(ring, upper_32_bits(iova)); +} + static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit, struct msm_file_private *ctx) { + unsigned int index = submit->seqno % MSM_GPU_SUBMIT_STATS_COUNT; struct msm_drm_private *priv = gpu->dev->dev_private; struct msm_ringbuffer *ring = submit->ring; unsigned int i; + get_stats_counter(ring, REG_A6XX_RBBM_PERFCTR_CP_0_LO, + rbmemptr_stats(ring, index, cpcycles_start)); + + /* + * For PM4 the GMU register offsets are calculated from the base of the + * GPU registers so we need to add 0x1a800 to the register value on A630 + * to get the right value from PM4. + */ + get_stats_counter(ring, REG_A6XX_GMU_ALWAYS_ON_COUNTER_L + 0x1a800, + rbmemptr_stats(ring, index, alwayson_start)); + /* Invalidate CCU depth and color */ OUT_PKT7(ring, CP_EVENT_WRITE, 1); OUT_RING(ring, PC_CCU_INVALIDATE_DEPTH); @@ -98,6 +119,11 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit, } } + get_stats_counter(ring, REG_A6XX_RBBM_PERFCTR_CP_0_LO, + rbmemptr_stats(ring, index, cpcycles_end)); + get_stats_counter(ring, REG_A6XX_GMU_ALWAYS_ON_COUNTER_L + 0x1a800, + rbmemptr_stats(ring, index, alwayson_end)); + /* Write the fence to the scratch register */ OUT_PKT4(ring, REG_A6XX_CP_SCRATCH_REG(2), 1); OUT_RING(ring, submit->seqno); @@ -387,14 +413,6 @@ static int a6xx_hw_init(struct msm_gpu *gpu) /* Select CP0 to always count cycles */ gpu_write(gpu, REG_A6XX_CP_PERFCTR_CP_SEL_0, PERF_CP_ALWAYS_COUNT); - /* FIXME: not sure if this should live here or in a6xx_gmu.c */ - gmu_write(&a6xx_gpu->gmu, REG_A6XX_GPU_GMU_AO_GPU_CX_BUSY_MASK, - 0xff000000); - gmu_rmw(&a6xx_gpu->gmu, REG_A6XX_GMU_CX_GMU_POWER_COUNTER_SELECT_0, - 0xff, 0x20); - gmu_write(&a6xx_gpu->gmu, REG_A6XX_GMU_CX_GMU_POWER_COUNTER_ENABLE, - 0x01); - gpu_write(gpu, REG_A6XX_RB_NC_MODE_CNTL, 2 << 1); gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, 2 << 1); gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL, 2 << 1); diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.h b/drivers/gpu/drm/msm/msm_ringbuffer.h index cffce094aecb..6434ebb13136 100644 --- a/drivers/gpu/drm/msm/msm_ringbuffer.h +++ b/drivers/gpu/drm/msm/msm_ringbuffer.h @@ -23,9 +23,25 @@ #define rbmemptr(ring, member) \ ((ring)->memptrs_iova + offsetof(struct msm_rbmemptrs, member)) +#define rbmemptr_stats(ring, index, member) \ + (rbmemptr((ring), stats) + \ + ((index) * sizeof(struct msm_gpu_submit_stats)) + \ + offsetof(struct msm_gpu_submit_stats, member)) + +struct msm_gpu_submit_stats { + u64 cpcycles_start; + u64 cpcycles_end; + u64 alwayson_start; + u64 alwayson_end; +}; + +#define MSM_GPU_SUBMIT_STATS_COUNT 64 + struct msm_rbmemptrs { volatile uint32_t rptr; volatile uint32_t fence; + + volatile struct msm_gpu_submit_stats stats[MSM_GPU_SUBMIT_STATS_COUNT]; }; struct msm_ringbuffer { From patchwork Fri Nov 2 15:25:21 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jordan Crouse X-Patchwork-Id: 10665663 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 395A1109C for ; Fri, 2 Nov 2018 15:25:48 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 27C9C28D72 for ; Fri, 2 Nov 2018 15:25:48 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 1C27F2B5BF; Fri, 2 Nov 2018 15:25:48 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 61685288A9 for ; Fri, 2 Nov 2018 15:25:47 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id B113189668; Fri, 2 Nov 2018 15:25:39 +0000 (UTC) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from smtp.codeaurora.org (smtp.codeaurora.org [198.145.29.96]) by gabe.freedesktop.org (Postfix) with ESMTPS id 6463A894EB; Fri, 2 Nov 2018 15:25:36 +0000 (UTC) Received: by smtp.codeaurora.org (Postfix, from userid 1000) id 4637E6130A; Fri, 2 Nov 2018 15:25:35 +0000 (UTC) Received: from jcrouse-lnx.qualcomm.com (i-global254.qualcomm.com [199.106.103.254]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: jcrouse@smtp.codeaurora.org) by smtp.codeaurora.org (Postfix) with ESMTPSA id C78F5612F2; Fri, 2 Nov 2018 15:25:33 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org C78F5612F2 From: Jordan Crouse To: freedreno@lists.freedesktop.org Subject: [PATCH 05/10] drm/msm/gpu: Add trace events for tracking GPU submissions Date: Fri, 2 Nov 2018 09:25:21 -0600 Message-Id: <20181102152526.23854-6-jcrouse@codeaurora.org> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20181102152526.23854-1-jcrouse@codeaurora.org> References: <20181102152526.23854-1-jcrouse@codeaurora.org> X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-arm-msm@vger.kernel.org, hoegsberg@chromium.org, smasetty@codeaurora.org, dri-devel@lists.freedesktop.org MIME-Version: 1.0 Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" X-Virus-Scanned: ClamAV using ClamSMTP Add trace events to track the progress of a GPU submission msm_gpu_submit occurs at the beginning of the submissions, msm_gpu_submit_flush happens when the submission is put on the ringbuffer and msm_submit_flush_retired is sent when the operation is retired. To make it easier to track the operations a unique sequence number is assigned to each submission and displayed in each event output so a human or a script can easily associate the events related to a specific submission. Signed-off-by: Jordan Crouse --- drivers/gpu/drm/msm/Makefile | 3 +- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 7 ++ drivers/gpu/drm/msm/msm_gem.h | 1 + drivers/gpu/drm/msm/msm_gem_submit.c | 15 +++- drivers/gpu/drm/msm/msm_gpu.c | 23 +++++- drivers/gpu/drm/msm/msm_gpu_trace.h | 90 +++++++++++++++++++++++ drivers/gpu/drm/msm/msm_gpu_tracepoints.c | 6 ++ 7 files changed, 139 insertions(+), 6 deletions(-) create mode 100644 drivers/gpu/drm/msm/msm_gpu_trace.h create mode 100644 drivers/gpu/drm/msm/msm_gpu_tracepoints.c diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile index 19ab521d4c3a..5de2d8f0a7e5 100644 --- a/drivers/gpu/drm/msm/Makefile +++ b/drivers/gpu/drm/msm/Makefile @@ -90,7 +90,8 @@ msm-y := \ msm_perf.o \ msm_rd.o \ msm_ringbuffer.o \ - msm_submitqueue.o + msm_submitqueue.o \ + msm_gpu_tracepoints.o msm-$(CONFIG_DEBUG_FS) += adreno/a5xx_debugfs.o \ disp/dpu1/dpu_dbg.o diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index cf66edfb5246..e0a918e8e969 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -4,6 +4,7 @@ #include "msm_gem.h" #include "msm_mmu.h" +#include "msm_gpu_trace.h" #include "a6xx_gpu.h" #include "a6xx_gmu.xml.h" @@ -81,6 +82,8 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit, { unsigned int index = submit->seqno % MSM_GPU_SUBMIT_STATS_COUNT; struct msm_drm_private *priv = gpu->dev->dev_private; + struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); + struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu); struct msm_ringbuffer *ring = submit->ring; unsigned int i; @@ -138,6 +141,10 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit, OUT_RING(ring, upper_32_bits(rbmemptr(ring, fence))); OUT_RING(ring, submit->seqno); + trace_msm_gpu_submit_flush(submit, + gmu_read64(&a6xx_gpu->gmu, REG_A6XX_GMU_ALWAYS_ON_COUNTER_L, + REG_A6XX_GMU_ALWAYS_ON_COUNTER_H)); + a6xx_flush(gpu, ring); } diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h index c5d9bd3e47a8..ddaf8663dc95 100644 --- a/drivers/gpu/drm/msm/msm_gem.h +++ b/drivers/gpu/drm/msm/msm_gem.h @@ -150,6 +150,7 @@ struct msm_gem_submit { struct msm_ringbuffer *ring; unsigned int nr_cmds; unsigned int nr_bos; + u32 ident; /* A "identifier" for the submit for logging */ struct { uint32_t type; uint32_t size; /* in dwords */ diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index 66673ea9bf6f..1b05b8481c31 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -20,6 +20,7 @@ #include "msm_drv.h" #include "msm_gpu.h" #include "msm_gem.h" +#include "msm_gpu_trace.h" /* * Cmdstream submission: @@ -48,7 +49,6 @@ static struct msm_gem_submit *submit_create(struct drm_device *dev, submit->dev = dev; submit->gpu = gpu; submit->fence = NULL; - submit->pid = get_pid(task_pid(current)); submit->cmd = (void *)&submit->bos[nr_bos]; submit->queue = queue; submit->ring = gpu->rb[queue->prio]; @@ -408,6 +408,7 @@ static void submit_cleanup(struct msm_gem_submit *submit) int msm_ioctl_gem_submit(struct drm_device *dev, void *data, struct drm_file *file) { + static atomic_t ident = ATOMIC_INIT(0); struct msm_drm_private *priv = dev->dev_private; struct drm_msm_gem_submit *args = data; struct msm_file_private *ctx = file->driver_priv; @@ -418,9 +419,9 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, struct msm_gpu_submitqueue *queue; struct msm_ringbuffer *ring; int out_fence_fd = -1; + struct pid *pid = get_pid(task_pid(current)); unsigned i; - int ret; - + int ret, submitid; if (!gpu) return -ENXIO; @@ -443,7 +444,12 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, if (!queue) return -ENOENT; + /* Get a unique identifier for the submission for logging purposes */ + submitid = atomic_inc_return(&ident) - 1; + ring = gpu->rb[queue->prio]; + trace_msm_gpu_submit(pid_nr(pid), ring->id, submitid, + args->nr_bos, args->nr_cmds); if (args->flags & MSM_SUBMIT_FENCE_FD_IN) { in_fence = sync_file_get_fence(args->fence_fd); @@ -480,6 +486,9 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, goto out_unlock; } + submit->pid = pid; + submit->ident = submitid; + if (args->flags & MSM_SUBMIT_SUDO) submit->in_rb = true; diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c index b7b4539c1731..4dcc759746df 100644 --- a/drivers/gpu/drm/msm/msm_gpu.c +++ b/drivers/gpu/drm/msm/msm_gpu.c @@ -19,6 +19,7 @@ #include "msm_gem.h" #include "msm_mmu.h" #include "msm_fence.h" +#include "msm_gpu_trace.h" #include #include @@ -662,10 +663,28 @@ int msm_gpu_perfcntr_sample(struct msm_gpu *gpu, uint32_t *activetime, * Cmdstream submission/retirement: */ -static void retire_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit) +static void retire_submit(struct msm_gpu *gpu, struct msm_ringbuffer *ring, + struct msm_gem_submit *submit) { + int index = submit->seqno % MSM_GPU_SUBMIT_STATS_COUNT; + volatile struct msm_gpu_submit_stats *stats; + u64 elapsed, clock = 0; int i; + stats = &ring->memptrs->stats[index]; + /* Convert 19.2Mhz alwayson ticks to nanoseconds for elapsed time */ + elapsed = (stats->alwayson_end - stats->alwayson_start) * 10000; + do_div(elapsed, 192); + + /* Calculate the clock frequency from the number of CP cycles */ + if (elapsed) { + clock = (stats->cpcycles_end - stats->cpcycles_start) * 1000; + do_div(clock, elapsed); + } + + trace_msm_gpu_submit_retired(submit, elapsed, clock, + stats->alwayson_start, stats->alwayson_end); + for (i = 0; i < submit->nr_bos; i++) { struct msm_gem_object *msm_obj = submit->bos[i].obj; /* move to inactive: */ @@ -693,7 +712,7 @@ static void retire_submits(struct msm_gpu *gpu) list_for_each_entry_safe(submit, tmp, &ring->submits, node) { if (dma_fence_is_signaled(submit->fence)) - retire_submit(gpu, submit); + retire_submit(gpu, ring, submit); } } } diff --git a/drivers/gpu/drm/msm/msm_gpu_trace.h b/drivers/gpu/drm/msm/msm_gpu_trace.h new file mode 100644 index 000000000000..5f310225043c --- /dev/null +++ b/drivers/gpu/drm/msm/msm_gpu_trace.h @@ -0,0 +1,90 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#if !defined(_MSM_GPU_TRACE_H_) || defined(TRACE_HEADER_MULTI_READ) +#define _MSM_GPU_TRACE_H_ + +#include + +#undef TRACE_SYSTEM +#define TRACE_SYSTEM drm_msm +#define TRACE_INCLUDE_FILE msm_gpu_trace + +TRACE_EVENT(msm_gpu_submit, + TP_PROTO(pid_t pid, u32 ringid, u32 id, u32 nr_bos, u32 nr_cmds), + TP_ARGS(pid, ringid, id, nr_bos, nr_cmds), + TP_STRUCT__entry( + __field(pid_t, pid) + __field(u32, id) + __field(u32, ringid) + __field(u32, nr_cmds) + __field(u32, nr_bos) + ), + TP_fast_assign( + __entry->pid = pid; + __entry->id = id; + __entry->ringid = ringid; + __entry->nr_bos = nr_bos; + __entry->nr_cmds = nr_cmds + ), + TP_printk("id=%d pid=%d ring=%d bos=%d cmds=%d", + __entry->id, __entry->pid, __entry->ringid, + __entry->nr_bos, __entry->nr_cmds) +); + +TRACE_EVENT(msm_gpu_submit_flush, + TP_PROTO(struct msm_gem_submit *submit, u64 ticks), + TP_ARGS(submit, ticks), + TP_STRUCT__entry( + __field(pid_t, pid) + __field(u32, id) + __field(u32, ringid) + __field(u32, seqno) + __field(u64, ticks) + ), + TP_fast_assign( + __entry->pid = pid_nr(submit->pid); + __entry->id = submit->ident; + __entry->ringid = submit->ring->id; + __entry->seqno = submit->seqno; + __entry->ticks = ticks; + ), + TP_printk("id=%d pid=%d ring=%d:%d ticks=%lld", + __entry->id, __entry->pid, __entry->ringid, __entry->seqno, + __entry->ticks) +); + + +TRACE_EVENT(msm_gpu_submit_retired, + TP_PROTO(struct msm_gem_submit *submit, u64 elapsed, u64 clock, + u64 start, u64 end), + TP_ARGS(submit, elapsed, clock, start, end), + TP_STRUCT__entry( + __field(pid_t, pid) + __field(u32, id) + __field(u32, ringid) + __field(u32, seqno) + __field(u64, elapsed) + __field(u64, clock) + __field(u64, start_ticks) + __field(u64, end_ticks) + ), + TP_fast_assign( + __entry->pid = pid_nr(submit->pid); + __entry->id = submit->ident; + __entry->ringid = submit->ring->id; + __entry->seqno = submit->seqno; + __entry->elapsed = elapsed; + __entry->clock = clock; + __entry->start_ticks = start; + __entry->end_ticks = end; + ), + TP_printk("id=%d pid=%d ring=%d:%d elapsed=%d ns mhz=%lld start=%lld end=%lld", + __entry->id, __entry->pid, __entry->ringid, __entry->seqno, + __entry->elapsed, __entry->clock, + __entry->start_ticks, __entry->end_ticks) +); + +#endif + +#undef TRACE_INCLUDE_PATH +#define TRACE_INCLUDE_PATH ../../drivers/gpu/drm/msm +#include diff --git a/drivers/gpu/drm/msm/msm_gpu_tracepoints.c b/drivers/gpu/drm/msm/msm_gpu_tracepoints.c new file mode 100644 index 000000000000..72c074f8c4f8 --- /dev/null +++ b/drivers/gpu/drm/msm/msm_gpu_tracepoints.c @@ -0,0 +1,6 @@ +// SPDX-License-Identifier: GPL-2.0 +#include "msm_gem.h" +#include "msm_ringbuffer.h" + +#define CREATE_TRACE_POINTS +#include "msm_gpu_trace.h" From patchwork Fri Nov 2 15:25:22 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jordan Crouse X-Patchwork-Id: 10665661 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 31C2F13BF for ; Fri, 2 Nov 2018 15:25:46 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 20C3E288A9 for ; Fri, 2 Nov 2018 15:25:46 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 14A8729015; Fri, 2 Nov 2018 15:25:46 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id BF1FC288A9 for ; Fri, 2 Nov 2018 15:25:45 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0691789622; Fri, 2 Nov 2018 15:25:39 +0000 (UTC) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from smtp.codeaurora.org (smtp.codeaurora.org [198.145.29.96]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4658D89622; Fri, 2 Nov 2018 15:25:37 +0000 (UTC) Received: by smtp.codeaurora.org (Postfix, from userid 1000) id 2D720612EE; Fri, 2 Nov 2018 15:25:35 +0000 (UTC) Received: from jcrouse-lnx.qualcomm.com (i-global254.qualcomm.com [199.106.103.254]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: jcrouse@smtp.codeaurora.org) by smtp.codeaurora.org (Postfix) with ESMTPSA id 94BE4610DA; Fri, 2 Nov 2018 15:25:34 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org 94BE4610DA From: Jordan Crouse To: freedreno@lists.freedesktop.org Subject: [PATCH 06/10] drm/msm/gpu: Only store local command buffers in the GPU state Date: Fri, 2 Nov 2018 09:25:22 -0600 Message-Id: <20181102152526.23854-7-jcrouse@codeaurora.org> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20181102152526.23854-1-jcrouse@codeaurora.org> References: <20181102152526.23854-1-jcrouse@codeaurora.org> X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-arm-msm@vger.kernel.org, hoegsberg@chromium.org, smasetty@codeaurora.org, dri-devel@lists.freedesktop.org MIME-Version: 1.0 Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" X-Virus-Scanned: ClamAV using ClamSMTP Instead of trying to store all the tagged buffers from a hanging submit only store the command buffers that were not imported. This cuts down on the amount of data stored in the GPU state to the base minimum of useful information. The downside is that this will make it more difficult to successfully replay a hang with just the GPU state but there isn't any reason why that functionality can't be added back in later once we've figured out how to better communicate such massive amounts of data. Signed-off-by: Jordan Crouse --- drivers/gpu/drm/msm/msm_gpu.c | 23 +++++++++++++---------- 1 file changed, 13 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c index 4dcc759746df..f0985a75f441 100644 --- a/drivers/gpu/drm/msm/msm_gpu.c +++ b/drivers/gpu/drm/msm/msm_gpu.c @@ -316,28 +316,28 @@ static void msm_gpu_crashstate_get_bo(struct msm_gpu_state *state, struct msm_gpu_state_bo *state_bo = &state->bos[state->nr_bos]; /* Don't record write only objects */ - state_bo->size = obj->base.size; state_bo->iova = iova; - /* Only store the data for buffer objects marked for read */ - if ((flags & MSM_SUBMIT_BO_READ)) { + /* Only store data for non imported buffer objects marked for read */ + if ((flags & MSM_SUBMIT_BO_READ) && !obj->base.import_attach) { void *ptr; state_bo->data = kvmalloc(obj->base.size, GFP_KERNEL); if (!state_bo->data) - return; + goto out; ptr = msm_gem_get_vaddr_active(&obj->base); if (IS_ERR(ptr)) { kvfree(state_bo->data); - return; + state_bo->data = NULL; + goto out; } memcpy(state_bo->data, ptr, obj->base.size); msm_gem_put_vaddr(&obj->base); } - +out: state->nr_bos++; } @@ -365,12 +365,15 @@ static void msm_gpu_crashstate_capture(struct msm_gpu *gpu, if (submit) { int i; - state->bos = kcalloc(submit->nr_bos, + state->bos = kcalloc(submit->nr_cmds, sizeof(struct msm_gpu_state_bo), GFP_KERNEL); - for (i = 0; state->bos && i < submit->nr_bos; i++) - msm_gpu_crashstate_get_bo(state, submit->bos[i].obj, - submit->bos[i].iova, submit->bos[i].flags); + for (i = 0; state->bos && i < submit->nr_cmds; i++) { + int idx = submit->cmd[i].idx; + + msm_gpu_crashstate_get_bo(state, submit->bos[idx].obj, + submit->bos[idx].iova, submit->bos[idx].flags); + } } /* Set the active crash state to be dumped on failure */ From patchwork Fri Nov 2 15:25:23 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jordan Crouse X-Patchwork-Id: 10665669 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4AC80109C for ; Fri, 2 Nov 2018 15:25:50 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3C00D288A9 for ; Fri, 2 Nov 2018 15:25:50 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 303212B5BF; Fri, 2 Nov 2018 15:25:50 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id E3240288A9 for ; Fri, 2 Nov 2018 15:25:49 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id CAAAA89781; Fri, 2 Nov 2018 15:25:42 +0000 (UTC) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from smtp.codeaurora.org (smtp.codeaurora.org [198.145.29.96]) by gabe.freedesktop.org (Postfix) with ESMTPS id 0FCEB89668; Fri, 2 Nov 2018 15:25:39 +0000 (UTC) Received: by smtp.codeaurora.org (Postfix, from userid 1000) id B3EA060881; Fri, 2 Nov 2018 15:25:36 +0000 (UTC) Received: from jcrouse-lnx.qualcomm.com (i-global254.qualcomm.com [199.106.103.254]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: jcrouse@smtp.codeaurora.org) by smtp.codeaurora.org (Postfix) with ESMTPSA id 7034D612F3; Fri, 2 Nov 2018 15:25:35 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org 7034D612F3 From: Jordan Crouse To: freedreno@lists.freedesktop.org Subject: [PATCH 07/10] drm/msm/gpu: Move gpu_poll_timeout() to adreno_gpu.h Date: Fri, 2 Nov 2018 09:25:23 -0600 Message-Id: <20181102152526.23854-8-jcrouse@codeaurora.org> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20181102152526.23854-1-jcrouse@codeaurora.org> References: <20181102152526.23854-1-jcrouse@codeaurora.org> X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-arm-msm@vger.kernel.org, hoegsberg@chromium.org, smasetty@codeaurora.org, dri-devel@lists.freedesktop.org MIME-Version: 1.0 Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" X-Virus-Scanned: ClamAV using ClamSMTP The gpu_poll_timeout() function can be useful to multiple targets so mvoe it into adreno_gpu.h from the a5xx code. Signed-off-by: Jordan Crouse --- drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 5 ----- drivers/gpu/drm/msm/adreno/adreno_gpu.h | 6 ++++++ 2 files changed, 6 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c index 2d34643ef2e2..fcb8e99563bc 100644 --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c @@ -20,7 +20,6 @@ #include #include #include -#include #include #include "msm_gem.h" #include "msm_mmu.h" @@ -1211,10 +1210,6 @@ struct a5xx_gpu_state { u32 *hlsqregs; }; -#define gpu_poll_timeout(gpu, addr, val, cond, interval, timeout) \ - readl_poll_timeout((gpu)->mmio + ((addr) << 2), val, cond, \ - interval, timeout) - static int a5xx_crashdumper_init(struct msm_gpu *gpu, struct a5xx_crashdumper *dumper) { diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h index de6e6ee42fba..7e5f1120ce7a 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h @@ -21,6 +21,7 @@ #define __ADRENO_GPU_H__ #include +#include #include "msm_gpu.h" @@ -375,4 +376,9 @@ static inline uint32_t get_wptr(struct msm_ringbuffer *ring) ((1 << 29) \ ((ilog2((_len)) & 0x1F) << 24) | (((_reg) << 2) & 0xFFFFF)) + +#define gpu_poll_timeout(gpu, addr, val, cond, interval, timeout) \ + readl_poll_timeout((gpu)->mmio + ((addr) << 2), val, cond, \ + interval, timeout) + #endif /* __ADRENO_GPU_H__ */ From patchwork Fri Nov 2 15:25:24 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jordan Crouse X-Patchwork-Id: 10665667 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E3286109C for ; Fri, 2 Nov 2018 15:25:48 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D23BC288A9 for ; Fri, 2 Nov 2018 15:25:48 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C69DA2B5BF; Fri, 2 Nov 2018 15:25:48 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 7FADA288A9 for ; Fri, 2 Nov 2018 15:25:48 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id A682189708; Fri, 2 Nov 2018 15:25:42 +0000 (UTC) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from smtp.codeaurora.org (smtp.codeaurora.org [198.145.29.96]) by gabe.freedesktop.org (Postfix) with ESMTPS id CA90D8967B; Fri, 2 Nov 2018 15:25:40 +0000 (UTC) Received: by smtp.codeaurora.org (Postfix, from userid 1000) id 5CE8260A34; Fri, 2 Nov 2018 15:25:37 +0000 (UTC) Received: from jcrouse-lnx.qualcomm.com (i-global254.qualcomm.com [199.106.103.254]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: jcrouse@smtp.codeaurora.org) by smtp.codeaurora.org (Postfix) with ESMTPSA id 5455E6085C; Fri, 2 Nov 2018 15:25:36 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org 5455E6085C From: Jordan Crouse To: freedreno@lists.freedesktop.org Subject: [PATCH 08/10] drm/msm/adreno: Don't capture register values if target doesn't define them Date: Fri, 2 Nov 2018 09:25:24 -0600 Message-Id: <20181102152526.23854-9-jcrouse@codeaurora.org> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20181102152526.23854-1-jcrouse@codeaurora.org> References: <20181102152526.23854-1-jcrouse@codeaurora.org> X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-arm-msm@vger.kernel.org, hoegsberg@chromium.org, smasetty@codeaurora.org, dri-devel@lists.freedesktop.org MIME-Version: 1.0 Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" X-Virus-Scanned: ClamAV using ClamSMTP If the GPU target doesn't define a list of registers then gracefully skip capturing and/or printing them. This is used by more complex targets like 6xx that have other means of capturing register values. Signed-off-by: Jordan Crouse --- drivers/gpu/drm/msm/adreno/adreno_gpu.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c index 141062fbb4c8..2c9bbad9c943 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c @@ -414,6 +414,10 @@ int adreno_gpu_state_get(struct msm_gpu *gpu, struct msm_gpu_state *state) } } + /* Some targets prefer to collect their own registers */ + if (!adreno_gpu->registers) + return 0; + /* Count the number of registers */ for (i = 0; adreno_gpu->registers[i] != ~0; i += 2) count += adreno_gpu->registers[i + 1] - @@ -551,12 +555,14 @@ void adreno_show(struct msm_gpu *gpu, struct msm_gpu_state *state, } } - drm_puts(p, "registers:\n"); + if (state->nr_registers) { + drm_puts(p, "registers:\n"); - for (i = 0; i < state->nr_registers; i++) { - drm_printf(p, " - { offset: 0x%04x, value: 0x%08x }\n", - state->registers[i * 2] << 2, - state->registers[(i * 2) + 1]); + for (i = 0; i < state->nr_registers; i++) { + drm_printf(p, " - { offset: 0x%04x, value: 0x%08x }\n", + state->registers[i * 2] << 2, + state->registers[(i * 2) + 1]); + } } } #endif @@ -595,6 +601,9 @@ void adreno_dump(struct msm_gpu *gpu) struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); int i; + if (!adreno_gpu->registers) + return; + /* dump these out in a form that can be parsed by demsm: */ printk("IO:region %s 00000000 00020000\n", gpu->name); for (i = 0; adreno_gpu->registers[i] != ~0; i += 2) { From patchwork Fri Nov 2 15:25:25 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jordan Crouse X-Patchwork-Id: 10665675 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A6612109C for ; Fri, 2 Nov 2018 15:25:56 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 922EB288A9 for ; Fri, 2 Nov 2018 15:25:56 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8572728D72; Fri, 2 Nov 2018 15:25:56 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id C54C92B5BF for ; Fri, 2 Nov 2018 15:25:53 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id B63E36E558; Fri, 2 Nov 2018 15:25:52 +0000 (UTC) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from smtp.codeaurora.org (smtp.codeaurora.org [198.145.29.96]) by gabe.freedesktop.org (Postfix) with ESMTPS id C566B6E558; Fri, 2 Nov 2018 15:25:51 +0000 (UTC) Received: by smtp.codeaurora.org (Postfix, from userid 1000) id 70E716135B; Fri, 2 Nov 2018 15:25:43 +0000 (UTC) Received: from jcrouse-lnx.qualcomm.com (i-global254.qualcomm.com [199.106.103.254]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: jcrouse@smtp.codeaurora.org) by smtp.codeaurora.org (Postfix) with ESMTPSA id 298AB6132B; Fri, 2 Nov 2018 15:25:37 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org 298AB6132B From: Jordan Crouse To: freedreno@lists.freedesktop.org Subject: [PATCH 09/10] drm/msm/a6xx: Add a6xx gpu state Date: Fri, 2 Nov 2018 09:25:25 -0600 Message-Id: <20181102152526.23854-10-jcrouse@codeaurora.org> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20181102152526.23854-1-jcrouse@codeaurora.org> References: <20181102152526.23854-1-jcrouse@codeaurora.org> X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-arm-msm@vger.kernel.org, hoegsberg@chromium.org, smasetty@codeaurora.org, dri-devel@lists.freedesktop.org MIME-Version: 1.0 Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" X-Virus-Scanned: ClamAV using ClamSMTP Add support for gathering and dumping the a6xx GPU state including registers, GMU registers, indexed registers, shader blocks, context clusters and debugbus. v2: Fix bugs discovered by Sharat Masetty Signed-off-by: Jordan Crouse --- drivers/gpu/drm/msm/Makefile | 1 + drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 25 +- drivers/gpu/drm/msm/adreno/a6xx_gmu.h | 3 + drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 39 +- drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 8 + drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 1159 +++++++++++++++++++ drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h | 430 +++++++ 7 files changed, 1627 insertions(+), 38 deletions(-) create mode 100644 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c create mode 100644 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile index 5de2d8f0a7e5..fabc17bf1a58 100644 --- a/drivers/gpu/drm/msm/Makefile +++ b/drivers/gpu/drm/msm/Makefile @@ -14,6 +14,7 @@ msm-y := \ adreno/a6xx_gpu.o \ adreno/a6xx_gmu.o \ adreno/a6xx_hfi.o \ + adreno/a6xx_gpu_state.o \ hdmi/hdmi.o \ hdmi/hdmi_audio.o \ hdmi/hdmi_bridge.o \ diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c index e816947ac7d8..c58e953fefa3 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c @@ -51,10 +51,31 @@ static irqreturn_t a6xx_hfi_irq(int irq, void *data) return IRQ_HANDLED; } +bool a6xx_gmu_sptprac_is_on(struct a6xx_gmu *gmu) +{ + u32 val; + + /* This can be called from gpu state code so make sure GMU is valid */ + if (IS_ERR_OR_NULL(gmu->mmio)) + return false; + + val = gmu_read(gmu, REG_A6XX_GMU_SPTPRAC_PWR_CLK_STATUS); + + return !(val & + (A6XX_GMU_SPTPRAC_PWR_CLK_STATUS_SPTPRAC_GDSC_POWER_OFF | + A6XX_GMU_SPTPRAC_PWR_CLK_STATUS_SP_CLOCK_OFF)); +} + /* Check to see if the GX rail is still powered */ -static bool a6xx_gmu_gx_is_on(struct a6xx_gmu *gmu) +bool a6xx_gmu_gx_is_on(struct a6xx_gmu *gmu) { - u32 val = gmu_read(gmu, REG_A6XX_GMU_SPTPRAC_PWR_CLK_STATUS); + u32 val; + + /* This can be called from gpu state code so make sure GMU is valid */ + if (IS_ERR_OR_NULL(gmu->mmio)) + return false; + + val = gmu_read(gmu, REG_A6XX_GMU_SPTPRAC_PWR_CLK_STATUS); return !(val & (A6XX_GMU_SPTPRAC_PWR_CLK_STATUS_GX_HM_GDSC_POWER_OFF | diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h index 35f765afae45..c721d9165d8e 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h @@ -164,4 +164,7 @@ void a6xx_hfi_init(struct a6xx_gmu *gmu); int a6xx_hfi_start(struct a6xx_gmu *gmu, int boot_state); void a6xx_hfi_stop(struct a6xx_gmu *gmu); +bool a6xx_gmu_gx_is_on(struct a6xx_gmu *gmu); +bool a6xx_gmu_sptprac_is_on(struct a6xx_gmu *gmu); + #endif diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index e0a918e8e969..11f0b99f94c8 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -670,33 +670,6 @@ static const u32 a6xx_register_offsets[REG_ADRENO_REGISTER_MAX] = { REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_CNTL, REG_A6XX_CP_RB_CNTL), }; -static const u32 a6xx_registers[] = { - 0x0000, 0x0002, 0x0010, 0x0010, 0x0012, 0x0012, 0x0018, 0x001b, - 0x001e, 0x0032, 0x0038, 0x003c, 0x0042, 0x0042, 0x0044, 0x0044, - 0x0047, 0x0047, 0x0056, 0x0056, 0x00ad, 0x00ae, 0x00b0, 0x00fb, - 0x0100, 0x011d, 0x0200, 0x020d, 0x0210, 0x0213, 0x0218, 0x023d, - 0x0400, 0x04f9, 0x0500, 0x0500, 0x0505, 0x050b, 0x050e, 0x0511, - 0x0533, 0x0533, 0x0540, 0x0555, 0x0800, 0x0808, 0x0810, 0x0813, - 0x0820, 0x0821, 0x0823, 0x0827, 0x0830, 0x0833, 0x0840, 0x0843, - 0x084f, 0x086f, 0x0880, 0x088a, 0x08a0, 0x08ab, 0x08c0, 0x08c4, - 0x08d0, 0x08dd, 0x08f0, 0x08f3, 0x0900, 0x0903, 0x0908, 0x0911, - 0x0928, 0x093e, 0x0942, 0x094d, 0x0980, 0x0984, 0x098d, 0x0996, - 0x0998, 0x099e, 0x09a0, 0x09a6, 0x09a8, 0x09ae, 0x09b0, 0x09b1, - 0x09c2, 0x09c8, 0x0a00, 0x0a03, 0x0c00, 0x0c04, 0x0c06, 0x0c06, - 0x0c10, 0x0cd9, 0x0e00, 0x0e0e, 0x0e10, 0x0e13, 0x0e17, 0x0e19, - 0x0e1c, 0x0e2b, 0x0e30, 0x0e32, 0x0e38, 0x0e39, 0x8600, 0x8601, - 0x8610, 0x861b, 0x8620, 0x8620, 0x8628, 0x862b, 0x8630, 0x8637, - 0x8e01, 0x8e01, 0x8e04, 0x8e05, 0x8e07, 0x8e08, 0x8e0c, 0x8e0c, - 0x8e10, 0x8e1c, 0x8e20, 0x8e25, 0x8e28, 0x8e28, 0x8e2c, 0x8e2f, - 0x8e3b, 0x8e3e, 0x8e40, 0x8e43, 0x8e50, 0x8e5e, 0x8e70, 0x8e77, - 0x9600, 0x9604, 0x9624, 0x9637, 0x9e00, 0x9e01, 0x9e03, 0x9e0e, - 0x9e11, 0x9e16, 0x9e19, 0x9e19, 0x9e1c, 0x9e1c, 0x9e20, 0x9e23, - 0x9e30, 0x9e31, 0x9e34, 0x9e34, 0x9e70, 0x9e72, 0x9e78, 0x9e79, - 0x9e80, 0x9fff, 0xa600, 0xa601, 0xa603, 0xa603, 0xa60a, 0xa60a, - 0xa610, 0xa617, 0xa630, 0xa630, - ~0 -}; - static int a6xx_pm_resume(struct msm_gpu *gpu) { struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); @@ -749,14 +722,6 @@ static int a6xx_get_timestamp(struct msm_gpu *gpu, uint64_t *value) return 0; } -#if defined(CONFIG_DEBUG_FS) || defined(CONFIG_DEV_COREDUMP) -static void a6xx_show(struct msm_gpu *gpu, struct msm_gpu_state *state, - struct drm_printer *p) -{ - adreno_show(gpu, state, p); -} -#endif - static struct msm_ringbuffer *a6xx_active_ring(struct msm_gpu *gpu) { struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); @@ -821,6 +786,8 @@ static const struct adreno_gpu_funcs funcs = { .gpu_busy = a6xx_gpu_busy, .gpu_get_freq = a6xx_gmu_get_freq, .gpu_set_freq = a6xx_gmu_set_freq, + .gpu_state_get = a6xx_gpu_state_get, + .gpu_state_put = a6xx_gpu_state_put, }, .get_timestamp = a6xx_get_timestamp, }; @@ -842,7 +809,7 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev) adreno_gpu = &a6xx_gpu->base; gpu = &adreno_gpu->base; - adreno_gpu->registers = a6xx_registers; + adreno_gpu->registers = NULL; adreno_gpu->reg_offsets = a6xx_register_offsets; ret = adreno_gpu_init(dev, pdev, adreno_gpu, &funcs, 1); diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h index 4127dcebc202..528a4cfe07cd 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h @@ -56,6 +56,14 @@ void a6xx_gmu_clear_oob(struct a6xx_gmu *gmu, enum a6xx_gmu_oob_state state); int a6xx_gmu_probe(struct a6xx_gpu *a6xx_gpu, struct device_node *node); void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu); + void a6xx_gmu_set_freq(struct msm_gpu *gpu, unsigned long freq); unsigned long a6xx_gmu_get_freq(struct msm_gpu *gpu); + +void a6xx_show(struct msm_gpu *gpu, struct msm_gpu_state *state, + struct drm_printer *p); + +struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu *gpu); +int a6xx_gpu_state_put(struct msm_gpu_state *state); + #endif /* __A6XX_GPU_H__ */ diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c new file mode 100644 index 000000000000..20f5b914c6fb --- /dev/null +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c @@ -0,0 +1,1159 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2018 The Linux Foundation. All rights reserved. */ + +#include +#include "msm_gem.h" +#include "a6xx_gpu.h" +#include "a6xx_gmu.h" +#include "a6xx_gpu_state.h" +#include "a6xx_gmu.xml.h" + +struct a6xx_gpu_state_obj { + const void *handle; + u32 *data; +}; + +struct a6xx_gpu_state { + struct msm_gpu_state base; + + struct a6xx_gpu_state_obj *gmu_registers; + int nr_gmu_registers; + + struct a6xx_gpu_state_obj *registers; + int nr_registers; + + struct a6xx_gpu_state_obj *shaders; + int nr_shaders; + + struct a6xx_gpu_state_obj *clusters; + int nr_clusters; + + struct a6xx_gpu_state_obj *dbgahb_clusters; + int nr_dbgahb_clusters; + + struct a6xx_gpu_state_obj *indexed_regs; + int nr_indexed_regs; + + struct a6xx_gpu_state_obj *debugbus; + int nr_debugbus; + + struct a6xx_gpu_state_obj *vbif_debugbus; + + struct a6xx_gpu_state_obj *cx_debugbus; + int nr_cx_debugbus; +}; + +static inline int CRASHDUMP_WRITE(u64 *in, u32 reg, u32 val) +{ + in[0] = val; + in[1] = (((u64) reg) << 44 | (1 << 21) | 1); + + return 2; +} + +static inline int CRASHDUMP_READ(u64 *in, u32 reg, u32 dwords, u64 target) +{ + in[0] = target; + in[1] = (((u64) reg) << 44 | dwords); + + return 2; +} + +static inline int CRASHDUMP_FINI(u64 *in) +{ + in[0] = 0; + in[1] = 0; + + return 2; +} + +struct a6xx_crashdumper { + void *ptr; + struct drm_gem_object *bo; + u64 iova; +}; + +/* + * Allocate 1MB for the crashdumper scratch region - 8k for the script and + * the rest for the data + */ +#define A6XX_CD_DATA_OFFSET 8192 +#define A6XX_CD_DATA_SIZE (SZ_1M - 8192) + +static int a6xx_crashdumper_init(struct msm_gpu *gpu, + struct a6xx_crashdumper *dumper) +{ + dumper->ptr = msm_gem_kernel_new_locked(gpu->dev, + SZ_1M, MSM_BO_UNCACHED, gpu->aspace, + &dumper->bo, &dumper->iova); + + return IS_ERR(dumper->ptr) ? PTR_ERR(dumper->ptr) : 0; +} + +static int a6xx_crashdumper_run(struct msm_gpu *gpu, + struct a6xx_crashdumper *dumper) +{ + struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); + struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu); + u32 val; + int ret; + + if (IS_ERR_OR_NULL(dumper->ptr)) + return -EINVAL; + + if (!a6xx_gmu_sptprac_is_on(&a6xx_gpu->gmu)) + return -EINVAL; + + /* Make sure all pending memory writes are posted */ + wmb(); + + gpu_write64(gpu, REG_A6XX_CP_CRASH_SCRIPT_BASE_LO, + REG_A6XX_CP_CRASH_SCRIPT_BASE_HI, dumper->iova); + + gpu_write(gpu, REG_A6XX_CP_CRASH_DUMP_CNTL, 1); + + ret = gpu_poll_timeout(gpu, REG_A6XX_CP_CRASH_DUMP_STATUS, val, + val & 0x02, 100, 10000); + + gpu_write(gpu, REG_A6XX_CP_CRASH_DUMP_CNTL, 0); + + return ret; +} + +static void a6xx_crashdumper_free(struct msm_gpu *gpu, + struct a6xx_crashdumper *dumper) +{ + msm_gem_put_iova(dumper->bo, gpu->aspace); + msm_gem_put_vaddr(dumper->bo); + + drm_gem_object_unreference(dumper->bo); +} + +/* read a value from the GX debug bus */ +static int debugbus_read(struct msm_gpu *gpu, u32 block, u32 offset, + u32 *data) +{ + u32 reg = A6XX_DBGC_CFG_DBGBUS_SEL_D_PING_INDEX(offset) | + A6XX_DBGC_CFG_DBGBUS_SEL_D_PING_BLK_SEL(block); + + gpu_write(gpu, REG_A6XX_DBGC_CFG_DBGBUS_SEL_A, reg); + gpu_write(gpu, REG_A6XX_DBGC_CFG_DBGBUS_SEL_B, reg); + gpu_write(gpu, REG_A6XX_DBGC_CFG_DBGBUS_SEL_C, reg); + gpu_write(gpu, REG_A6XX_DBGC_CFG_DBGBUS_SEL_D, reg); + + /* Wait 1 us to make sure the data is flowing */ + udelay(1); + + data[0] = gpu_read(gpu, REG_A6XX_DBGC_CFG_DBGBUS_TRACE_BUF2); + data[1] = gpu_read(gpu, REG_A6XX_DBGC_CFG_DBGBUS_TRACE_BUF1); + + return 2; +} + +#define cxdbg_write(ptr, offset, val) \ + msm_writel((val), (ptr) + ((offset) << 2)) + +#define cxdbg_read(ptr, offset) \ + msm_readl((ptr) + ((offset) << 2)) + +/* read a value from the CX debug bus */ +static int cx_debugbus_read(void *__iomem cxdbg, u32 block, u32 offset, + u32 *data) +{ + u32 reg = A6XX_CX_DBGC_CFG_DBGBUS_SEL_A_PING_INDEX(offset) | + A6XX_CX_DBGC_CFG_DBGBUS_SEL_A_PING_BLK_SEL(block); + + cxdbg_write(cxdbg, REG_A6XX_CX_DBGC_CFG_DBGBUS_SEL_A, reg); + cxdbg_write(cxdbg, REG_A6XX_CX_DBGC_CFG_DBGBUS_SEL_B, reg); + cxdbg_write(cxdbg, REG_A6XX_CX_DBGC_CFG_DBGBUS_SEL_C, reg); + cxdbg_write(cxdbg, REG_A6XX_CX_DBGC_CFG_DBGBUS_SEL_D, reg); + + /* Wait 1 us to make sure the data is flowing */ + udelay(1); + + data[0] = cxdbg_read(cxdbg, REG_A6XX_CX_DBGC_CFG_DBGBUS_TRACE_BUF2); + data[1] = cxdbg_read(cxdbg, REG_A6XX_CX_DBGC_CFG_DBGBUS_TRACE_BUF1); + + return 2; +} + +/* Read a chunk of data from the VBIF debug bus */ +static int vbif_debugbus_read(struct msm_gpu *gpu, u32 ctrl0, u32 ctrl1, + u32 reg, int count, u32 *data) +{ + int i; + + gpu_write(gpu, ctrl0, reg); + + for (i = 0; i < count; i++) { + gpu_write(gpu, ctrl1, i); + data[i] = gpu_read(gpu, REG_A6XX_VBIF_TEST_BUS_OUT); + } + + return count; +} + +#define AXI_ARB_BLOCKS 2 +#define XIN_AXI_BLOCKS 5 +#define XIN_CORE_BLOCKS 4 + +#define VBIF_DEBUGBUS_BLOCK_SIZE \ + ((16 * AXI_ARB_BLOCKS) + \ + (18 * XIN_AXI_BLOCKS) + \ + (12 * XIN_CORE_BLOCKS)) + +static void a6xx_get_vbif_debugbus_block(struct msm_gpu *gpu, + struct a6xx_gpu_state_obj *obj) +{ + u32 clk, *ptr; + int i; + + obj->data = kcalloc(VBIF_DEBUGBUS_BLOCK_SIZE, sizeof(u32), GFP_KERNEL); + obj->handle = NULL; + + /* Get the current clock setting */ + clk = gpu_read(gpu, REG_A6XX_VBIF_CLKON); + + /* Force on the bus so we can read it */ + gpu_write(gpu, REG_A6XX_VBIF_CLKON, + clk | A6XX_VBIF_CLKON_FORCE_ON_TESTBUS); + + /* We will read from BUS2 first, so disable BUS1 */ + gpu_write(gpu, REG_A6XX_VBIF_TEST_BUS1_CTRL0, 0); + + /* Enable the VBIF bus for reading */ + gpu_write(gpu, REG_A6XX_VBIF_TEST_BUS_OUT_CTRL, 1); + + ptr = obj->data; + + for (i = 0; i < AXI_ARB_BLOCKS; i++) + ptr += vbif_debugbus_read(gpu, + REG_A6XX_VBIF_TEST_BUS2_CTRL0, + REG_A6XX_VBIF_TEST_BUS2_CTRL1, + 1 << (i + 16), 16, ptr); + + for (i = 0; i < XIN_AXI_BLOCKS; i++) + ptr += vbif_debugbus_read(gpu, + REG_A6XX_VBIF_TEST_BUS2_CTRL0, + REG_A6XX_VBIF_TEST_BUS2_CTRL1, + 1 << i, 18, ptr); + + /* Stop BUS2 so we can turn on BUS1 */ + gpu_write(gpu, REG_A6XX_VBIF_TEST_BUS2_CTRL0, 0); + + for (i = 0; i < XIN_CORE_BLOCKS; i++) + ptr += vbif_debugbus_read(gpu, + REG_A6XX_VBIF_TEST_BUS1_CTRL0, + REG_A6XX_VBIF_TEST_BUS1_CTRL1, + 1 << i, 12, ptr); + + /* Restore the VBIF clock setting */ + gpu_write(gpu, REG_A6XX_VBIF_CLKON, clk); +} + +static void a6xx_get_debugbus_block(struct msm_gpu *gpu, + const struct a6xx_debugbus_block *block, + struct a6xx_gpu_state_obj *obj) +{ + int i; + u32 *ptr; + + obj->data = kcalloc(block->count, sizeof(u64), GFP_KERNEL); + if (!obj->data) + return; + + obj->handle = block; + + for (ptr = obj->data, i = 0; i < block->count; i++) + ptr += debugbus_read(gpu, block->id, i, ptr); +} + +static void a6xx_get_cx_debugbus_block(void __iomem *cxdbg, + const struct a6xx_debugbus_block *block, + struct a6xx_gpu_state_obj *obj) +{ + int i; + u32 *ptr; + + obj->data = kcalloc(block->count, sizeof(u64), GFP_KERNEL); + if (!obj->data) + return; + + obj->handle = block; + + for (ptr = obj->data, i = 0; i < block->count; i++) + ptr += cx_debugbus_read(cxdbg, block->id, i, ptr); +} + +static void a6xx_get_debugbus(struct msm_gpu *gpu, + struct a6xx_gpu_state *a6xx_state) +{ + struct resource *res; + void __iomem *cxdbg = NULL; + + /* Set up the GX debug bus */ + + gpu_write(gpu, REG_A6XX_DBGC_CFG_DBGBUS_CNTLT, + A6XX_DBGC_CFG_DBGBUS_CNTLT_SEGT(0xf)); + + gpu_write(gpu, REG_A6XX_DBGC_CFG_DBGBUS_CNTLM, + A6XX_DBGC_CFG_DBGBUS_CNTLM_ENABLE(0xf)); + + gpu_write(gpu, REG_A6XX_DBGC_CFG_DBGBUS_IVTL_0, 0); + gpu_write(gpu, REG_A6XX_DBGC_CFG_DBGBUS_IVTL_1, 0); + gpu_write(gpu, REG_A6XX_DBGC_CFG_DBGBUS_IVTL_2, 0); + gpu_write(gpu, REG_A6XX_DBGC_CFG_DBGBUS_IVTL_3, 0); + + gpu_write(gpu, REG_A6XX_DBGC_CFG_DBGBUS_BYTEL_0, 0x76543210); + gpu_write(gpu, REG_A6XX_DBGC_CFG_DBGBUS_BYTEL_1, 0xFEDCBA98); + + gpu_write(gpu, REG_A6XX_DBGC_CFG_DBGBUS_MASKL_0, 0); + gpu_write(gpu, REG_A6XX_DBGC_CFG_DBGBUS_MASKL_1, 0); + gpu_write(gpu, REG_A6XX_DBGC_CFG_DBGBUS_MASKL_2, 0); + gpu_write(gpu, REG_A6XX_DBGC_CFG_DBGBUS_MASKL_3, 0); + + /* Set up the CX debug bus - it lives elsewhere in the system so do a + * temporary ioremap for the registers + */ + res = platform_get_resource_byname(gpu->pdev, IORESOURCE_MEM, + "cx_dbgc"); + + if (res) + cxdbg = ioremap(res->start, resource_size(res)); + + if (cxdbg) { + cxdbg_write(cxdbg, REG_A6XX_DBGC_CFG_DBGBUS_CNTLT, + A6XX_DBGC_CFG_DBGBUS_CNTLT_SEGT(0xf)); + + cxdbg_write(cxdbg, REG_A6XX_DBGC_CFG_DBGBUS_CNTLM, + A6XX_DBGC_CFG_DBGBUS_CNTLM_ENABLE(0xf)); + + cxdbg_write(cxdbg, REG_A6XX_DBGC_CFG_DBGBUS_IVTL_0, 0); + cxdbg_write(cxdbg, REG_A6XX_DBGC_CFG_DBGBUS_IVTL_1, 0); + cxdbg_write(cxdbg, REG_A6XX_DBGC_CFG_DBGBUS_IVTL_2, 0); + cxdbg_write(cxdbg, REG_A6XX_DBGC_CFG_DBGBUS_IVTL_3, 0); + + cxdbg_write(cxdbg, REG_A6XX_DBGC_CFG_DBGBUS_BYTEL_0, + 0x76543210); + cxdbg_write(cxdbg, REG_A6XX_DBGC_CFG_DBGBUS_BYTEL_1, + 0xFEDCBA98); + + cxdbg_write(cxdbg, REG_A6XX_DBGC_CFG_DBGBUS_MASKL_0, 0); + cxdbg_write(cxdbg, REG_A6XX_DBGC_CFG_DBGBUS_MASKL_1, 0); + cxdbg_write(cxdbg, REG_A6XX_DBGC_CFG_DBGBUS_MASKL_2, 0); + cxdbg_write(cxdbg, REG_A6XX_DBGC_CFG_DBGBUS_MASKL_3, 0); + } + + a6xx_state->debugbus = kcalloc(ARRAY_SIZE(a6xx_debugbus_blocks), + sizeof(*a6xx_state->debugbus), GFP_KERNEL); + + if (a6xx_state->debugbus) { + int i; + + for (i = 0; i < ARRAY_SIZE(a6xx_debugbus_blocks); i++) + a6xx_get_debugbus_block(gpu, + &a6xx_debugbus_blocks[i], + &a6xx_state->debugbus[i]); + + a6xx_state->nr_debugbus = ARRAY_SIZE(a6xx_debugbus_blocks); + } + + a6xx_state->vbif_debugbus = kzalloc(sizeof(*a6xx_state->vbif_debugbus), + GFP_KERNEL); + + if (a6xx_state->vbif_debugbus) + a6xx_get_vbif_debugbus_block(gpu, a6xx_state->vbif_debugbus); + + if (cxdbg) { + a6xx_state->cx_debugbus = + kcalloc(ARRAY_SIZE(a6xx_cx_debugbus_blocks), + sizeof(*a6xx_state->cx_debugbus), GFP_KERNEL); + + if (a6xx_state->cx_debugbus) { + int i; + + for (i = 0; i < ARRAY_SIZE(a6xx_cx_debugbus_blocks); i++) + a6xx_get_cx_debugbus_block(cxdbg, + &a6xx_cx_debugbus_blocks[i], + &a6xx_state->cx_debugbus[i]); + + a6xx_state->nr_cx_debugbus = + ARRAY_SIZE(a6xx_cx_debugbus_blocks); + } + + iounmap(cxdbg); + } +} + +#define RANGE(reg, a) ((reg)[(a) + 1] - (reg)[(a)] + 1) + +/* Read a data cluster from behind the AHB aperture */ +static void a6xx_get_dbgahb_cluster(struct msm_gpu *gpu, + const struct a6xx_dbgahb_cluster *dbgahb, + struct a6xx_gpu_state_obj *obj, + struct a6xx_crashdumper *dumper) +{ + u64 *in = dumper->ptr; + u64 out = dumper->iova + A6XX_CD_DATA_OFFSET; + size_t datasize; + int i, regcount = 0; + + for (i = 0; i < A6XX_NUM_CONTEXTS; i++) { + int j; + + in += CRASHDUMP_WRITE(in, REG_A6XX_HLSQ_DBG_READ_SEL, + (dbgahb->statetype + i * 2) << 8); + + for (j = 0; j < dbgahb->count; j += 2) { + int count = RANGE(dbgahb->registers, j); + u32 offset = REG_A6XX_HLSQ_DBG_AHB_READ_APERTURE + + dbgahb->registers[j] - (dbgahb->base >> 2); + + in += CRASHDUMP_READ(in, offset, count, out); + + out += count * sizeof(u32); + + if (i == 0) + regcount += count; + } + } + + CRASHDUMP_FINI(in); + + datasize = regcount * A6XX_NUM_CONTEXTS * sizeof(u32); + + if (WARN_ON(datasize > A6XX_CD_DATA_SIZE)) + return; + + if (a6xx_crashdumper_run(gpu, dumper)) + return; + + obj->handle = dbgahb; + obj->data = kmemdup(dumper->ptr + A6XX_CD_DATA_OFFSET, + datasize, GFP_KERNEL); +} + +static void a6xx_get_dbgahb_clusters(struct msm_gpu *gpu, + struct a6xx_gpu_state *a6xx_state, + struct a6xx_crashdumper *dumper) +{ + int i; + + a6xx_state->dbgahb_clusters = kcalloc(ARRAY_SIZE(a6xx_dbgahb_clusters), + sizeof(*a6xx_state->dbgahb_clusters), GFP_KERNEL); + + if (!a6xx_state->dbgahb_clusters) + return; + + a6xx_state->nr_dbgahb_clusters = ARRAY_SIZE(a6xx_dbgahb_clusters); + + for (i = 0; i < ARRAY_SIZE(a6xx_dbgahb_clusters); i++) + a6xx_get_dbgahb_cluster(gpu, &a6xx_dbgahb_clusters[i], + &a6xx_state->dbgahb_clusters[i], dumper); +} + +/* Read a data cluster from the CP aperture with the crashdumper */ +static void a6xx_get_cluster(struct msm_gpu *gpu, + const struct a6xx_cluster *cluster, + struct a6xx_gpu_state_obj *obj, + struct a6xx_crashdumper *dumper) +{ + u64 *in = dumper->ptr; + u64 out = dumper->iova + A6XX_CD_DATA_OFFSET; + size_t datasize; + int i, regcount = 0; + + /* Some clusters need a selector register to be programmed too */ + if (cluster->sel_reg) + in += CRASHDUMP_WRITE(in, cluster->sel_reg, cluster->sel_val); + + for (i = 0; i < A6XX_NUM_CONTEXTS; i++) { + int j; + + in += CRASHDUMP_WRITE(in, REG_A6XX_CP_APERTURE_CNTL_CD, + (cluster->id << 8) | (i << 4) | i); + + for (j = 0; j < cluster->count; j += 2) { + int count = RANGE(cluster->registers, j); + + in += CRASHDUMP_READ(in, cluster->registers[j], + count, out); + + out += count * sizeof(u32); + + if (i == 0) + regcount += count; + } + } + + CRASHDUMP_FINI(in); + + datasize = regcount * A6XX_NUM_CONTEXTS * sizeof(u32); + + if (WARN_ON(datasize > A6XX_CD_DATA_SIZE)) + return; + + if (a6xx_crashdumper_run(gpu, dumper)) + return; + + obj->handle = cluster; + obj->data = kmemdup(dumper->ptr + A6XX_CD_DATA_OFFSET, + datasize, GFP_KERNEL); +} + +static void a6xx_get_clusters(struct msm_gpu *gpu, + struct a6xx_gpu_state *a6xx_state, + struct a6xx_crashdumper *dumper) +{ + int i; + + a6xx_state->clusters = kcalloc(ARRAY_SIZE(a6xx_clusters), + sizeof(*a6xx_state->clusters), GFP_KERNEL); + + if (!a6xx_state->clusters) + return; + + a6xx_state->nr_clusters = ARRAY_SIZE(a6xx_clusters); + + for (i = 0; i < ARRAY_SIZE(a6xx_clusters); i++) + a6xx_get_cluster(gpu, &a6xx_clusters[i], + &a6xx_state->clusters[i], dumper); +} + +/* Read a shader / debug block from the HLSQ aperture with the crashdumper */ +static void a6xx_get_shader_block(struct msm_gpu *gpu, + const struct a6xx_shader_block *block, + struct a6xx_gpu_state_obj *obj, + struct a6xx_crashdumper *dumper) +{ + u64 *in = dumper->ptr; + size_t datasize = block->size * A6XX_NUM_SHADER_BANKS * sizeof(u32); + int i; + + if (WARN_ON(datasize > A6XX_CD_DATA_SIZE)) + return; + + for (i = 0; i < A6XX_NUM_SHADER_BANKS; i++) { + in += CRASHDUMP_WRITE(in, REG_A6XX_HLSQ_DBG_READ_SEL, + (block->type << 8) | i); + + in += CRASHDUMP_READ(in, REG_A6XX_HLSQ_DBG_AHB_READ_APERTURE, + block->size, dumper->iova + A6XX_CD_DATA_OFFSET); + } + + CRASHDUMP_FINI(in); + + if (a6xx_crashdumper_run(gpu, dumper)) + return; + + obj->handle = block; + obj->data = kmemdup(dumper->ptr + A6XX_CD_DATA_OFFSET, + datasize, GFP_KERNEL); +} + +static void a6xx_get_shaders(struct msm_gpu *gpu, + struct a6xx_gpu_state *a6xx_state, + struct a6xx_crashdumper *dumper) +{ + int i; + + a6xx_state->shaders = kcalloc(ARRAY_SIZE(a6xx_shader_blocks), + sizeof(*a6xx_state->shaders), GFP_KERNEL); + + if (!a6xx_state->shaders) + return; + + a6xx_state->nr_shaders = ARRAY_SIZE(a6xx_shader_blocks); + + for (i = 0; i < ARRAY_SIZE(a6xx_shader_blocks); i++) + a6xx_get_shader_block(gpu, &a6xx_shader_blocks[i], + &a6xx_state->shaders[i], dumper); +} + +/* Read registers from behind the HLSQ aperture with the crashdumper */ +static void a6xx_get_crashdumper_hlsq_registers(struct msm_gpu *gpu, + const struct a6xx_registers *regs, + struct a6xx_gpu_state_obj *obj, + struct a6xx_crashdumper *dumper) + +{ + u64 *in = dumper->ptr; + u64 out = dumper->iova + A6XX_CD_DATA_OFFSET; + int i, regcount = 0; + + in += CRASHDUMP_WRITE(in, REG_A6XX_HLSQ_DBG_READ_SEL, regs->val1); + + for (i = 0; i < regs->count; i += 2) { + u32 count = RANGE(regs->registers, i); + u32 offset = REG_A6XX_HLSQ_DBG_AHB_READ_APERTURE + + regs->registers[i] - (regs->val0 >> 2); + + in += CRASHDUMP_READ(in, offset, count, out); + + out += count * sizeof(u32); + regcount += count; + } + + CRASHDUMP_FINI(in); + + if (WARN_ON((regcount * sizeof(u32)) > A6XX_CD_DATA_SIZE)) + return; + + if (a6xx_crashdumper_run(gpu, dumper)) + return; + + obj->handle = regs; + obj->data = kmemdup(dumper->ptr + A6XX_CD_DATA_OFFSET, + regcount * sizeof(u32), GFP_KERNEL); +} + +/* Read a block of registers using the crashdumper */ +static void a6xx_get_crashdumper_registers(struct msm_gpu *gpu, + const struct a6xx_registers *regs, + struct a6xx_gpu_state_obj *obj, + struct a6xx_crashdumper *dumper) + +{ + u64 *in = dumper->ptr; + u64 out = dumper->iova + A6XX_CD_DATA_OFFSET; + int i, regcount = 0; + + /* Some blocks might need to program a selector register first */ + if (regs->val0) + in += CRASHDUMP_WRITE(in, regs->val0, regs->val1); + + for (i = 0; i < regs->count; i += 2) { + u32 count = RANGE(regs->registers, i); + + in += CRASHDUMP_READ(in, regs->registers[i], count, out); + + out += count * sizeof(u32); + regcount += count; + } + + CRASHDUMP_FINI(in); + + if (WARN_ON((regcount * sizeof(u32)) > A6XX_CD_DATA_SIZE)) + return; + + if (a6xx_crashdumper_run(gpu, dumper)) + return; + + obj->handle = regs; + obj->data = kmemdup(dumper->ptr + A6XX_CD_DATA_OFFSET, + regcount * sizeof(u32), GFP_KERNEL); +} + +/* Read a block of registers via AHB */ +static void a6xx_get_ahb_gpu_registers(struct msm_gpu *gpu, + const struct a6xx_registers *regs, + struct a6xx_gpu_state_obj *obj) +{ + int i, regcount = 0, index = 0; + + for (i = 0; i < regs->count; i += 2) + regcount += RANGE(regs->registers, i); + + obj->handle = (const void *) regs; + obj->data = kcalloc(regcount, sizeof(u32), GFP_KERNEL); + if (!obj->data) + return; + + for (i = 0; i < regs->count; i += 2) { + u32 count = RANGE(regs->registers, i); + int j; + + for (j = 0; j < count; j++) + obj->data[index++] = gpu_read(gpu, + regs->registers[i] + j); + } +} + +/* Read a block of GMU registers */ +static void _a6xx_get_gmu_registers(struct msm_gpu *gpu, + const struct a6xx_registers *regs, + struct a6xx_gpu_state_obj *obj) +{ + struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); + struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu); + struct a6xx_gmu *gmu = &a6xx_gpu->gmu; + int i, regcount = 0, index = 0; + + for (i = 0; i < regs->count; i += 2) + regcount += RANGE(regs->registers, i); + + obj->handle = (const void *) regs; + obj->data = kcalloc(regcount, sizeof(u32), GFP_KERNEL); + if (!obj->data) + return; + + for (i = 0; i < regs->count; i += 2) { + u32 count = RANGE(regs->registers, i); + int j; + + for (j = 0; j < count; j++) + obj->data[index++] = gmu_read(gmu, + regs->registers[i] + j); + } +} + +static void a6xx_get_gmu_registers(struct msm_gpu *gpu, + struct a6xx_gpu_state *a6xx_state) +{ + struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); + struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu); + + a6xx_state->gmu_registers = kcalloc(2, + sizeof(*a6xx_state->gmu_registers), GFP_KERNEL); + + if (!a6xx_state->gmu_registers) + return; + + a6xx_state->nr_gmu_registers = 2; + + /* Get the CX GMU registers from AHB */ + _a6xx_get_gmu_registers(gpu, &a6xx_gmu_reglist[0], + &a6xx_state->gmu_registers[0]); + + if (!a6xx_gmu_gx_is_on(&a6xx_gpu->gmu)) + return; + + /* Set the fence to ALLOW mode so we can access the registers */ + gpu_write(gpu, REG_A6XX_GMU_AO_AHB_FENCE_CTRL, 0); + + _a6xx_get_gmu_registers(gpu, &a6xx_gmu_reglist[1], + &a6xx_state->gmu_registers[1]); +} + +static void a6xx_get_registers(struct msm_gpu *gpu, + struct a6xx_gpu_state *a6xx_state, + struct a6xx_crashdumper *dumper) +{ + int i, count = ARRAY_SIZE(a6xx_ahb_reglist) + + ARRAY_SIZE(a6xx_reglist) + + ARRAY_SIZE(a6xx_hlsq_reglist); + int index = 0; + + a6xx_state->registers = kcalloc(count, sizeof(*a6xx_state->registers), + GFP_KERNEL); + + if (!a6xx_state->registers) + return; + + a6xx_state->nr_registers = count; + + for (i = 0; i < ARRAY_SIZE(a6xx_ahb_reglist); i++) + a6xx_get_ahb_gpu_registers(gpu, + &a6xx_ahb_reglist[i], + &a6xx_state->registers[index++]); + + for (i = 0; i < ARRAY_SIZE(a6xx_reglist); i++) + a6xx_get_crashdumper_registers(gpu, + &a6xx_reglist[i], + &a6xx_state->registers[index++], + dumper); + + for (i = 0; i < ARRAY_SIZE(a6xx_hlsq_reglist); i++) + a6xx_get_crashdumper_hlsq_registers(gpu, + &a6xx_hlsq_reglist[i], + &a6xx_state->registers[index++], + dumper); +} + +/* Read a block of data from an indexed register pair */ +static void a6xx_get_indexed_regs(struct msm_gpu *gpu, + const struct a6xx_indexed_registers *indexed, + struct a6xx_gpu_state_obj *obj) +{ + int i; + + obj->handle = (const void *) indexed; + obj->data = kcalloc(indexed->count, sizeof(u32), GFP_KERNEL); + if (!obj->data) + return; + + /* All the indexed banks start at address 0 */ + gpu_write(gpu, indexed->addr, 0); + + /* Read the data - each read increments the internal address by 1 */ + for (i = 0; i < indexed->count; i++) + obj->data[i] = gpu_read(gpu, indexed->data); +} + +static void a6xx_get_indexed_registers(struct msm_gpu *gpu, + struct a6xx_gpu_state *a6xx_state) +{ + u32 mempool_size; + int count = ARRAY_SIZE(a6xx_indexed_reglist) + 1; + int i; + + a6xx_state->indexed_regs = kcalloc(count, + sizeof(a6xx_state->indexed_regs), GFP_KERNEL); + if (!a6xx_state->indexed_regs) + return; + + for (i = 0; i < ARRAY_SIZE(a6xx_indexed_reglist); i++) + a6xx_get_indexed_regs(gpu, &a6xx_indexed_reglist[i], + &a6xx_state->indexed_regs[i]); + + /* Set the CP mempool size to 0 to stabilize it while dumping */ + mempool_size = gpu_read(gpu, REG_A6XX_CP_MEM_POOL_SIZE); + gpu_write(gpu, REG_A6XX_CP_MEM_POOL_SIZE, 0); + + /* Get the contents of the CP mempool */ + a6xx_get_indexed_regs(gpu, &a6xx_cp_mempool_indexed, + &a6xx_state->indexed_regs[i]); + + /* + * Offset 0x2000 in the mempool is the size - copy the saved size over + * so the data is consistent + */ + a6xx_state->indexed_regs[i].data[0x2000] = mempool_size; + + /* Restore the size in the hardware */ + gpu_write(gpu, REG_A6XX_CP_MEM_POOL_SIZE, mempool_size); + + a6xx_state->nr_indexed_regs = count; +} + +struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu *gpu) +{ + struct a6xx_crashdumper dumper = { 0 }; + struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); + struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu); + struct a6xx_gpu_state *a6xx_state = kzalloc(sizeof(*a6xx_state), + GFP_KERNEL); + + if (!a6xx_state) + return ERR_PTR(-ENOMEM); + + /* Get the generic state from the adreno core */ + adreno_gpu_state_get(gpu, &a6xx_state->base); + + a6xx_get_gmu_registers(gpu, a6xx_state); + + /* If GX isn't on the rest of the data isn't going to be accessible */ + if (!a6xx_gmu_gx_is_on(&a6xx_gpu->gmu)) + return &a6xx_state->base; + + /* Get the banks of indexed registers */ + a6xx_get_indexed_registers(gpu, a6xx_state); + + /* Try to initialize the crashdumper */ + if (!a6xx_crashdumper_init(gpu, &dumper)) { + a6xx_get_registers(gpu, a6xx_state, &dumper); + a6xx_get_shaders(gpu, a6xx_state, &dumper); + a6xx_get_clusters(gpu, a6xx_state, &dumper); + a6xx_get_dbgahb_clusters(gpu, a6xx_state, &dumper); + + a6xx_crashdumper_free(gpu, &dumper); + } + + a6xx_get_debugbus(gpu, a6xx_state); + + return &a6xx_state->base; +} + +void a6xx_gpu_state_destroy(struct kref *kref) +{ + struct msm_gpu_state *state = container_of(kref, + struct msm_gpu_state, ref); + struct a6xx_gpu_state *a6xx_state = container_of(state, + struct a6xx_gpu_state, base); + int i; + + for (i = 0; i < a6xx_state->nr_gmu_registers; i++) + kfree(a6xx_state->gmu_registers[i].data); + + kfree(a6xx_state->gmu_registers); + + for (i = 0; i < a6xx_state->nr_registers; i++) + kfree(a6xx_state->registers[i].data); + + kfree(a6xx_state->registers); + + for (i = 0; i < a6xx_state->nr_shaders; i++) + kfree(a6xx_state->shaders[i].data); + + kfree(a6xx_state->shaders); + + for (i = 0; i < a6xx_state->nr_clusters; i++) + kfree(a6xx_state->clusters[i].data); + + kfree(a6xx_state->clusters); + + for (i = 0; i < a6xx_state->nr_dbgahb_clusters; i++) + kfree(a6xx_state->dbgahb_clusters[i].data); + + kfree(a6xx_state->dbgahb_clusters); + + for (i = 0; i < a6xx_state->nr_indexed_regs; i++) + kfree(a6xx_state->indexed_regs[i].data); + + kfree(a6xx_state->indexed_regs); + + for (i = 0; i < a6xx_state->nr_debugbus; i++) + kfree(a6xx_state->debugbus[i].data); + + kfree(a6xx_state->debugbus); + + if (a6xx_state->vbif_debugbus) + kfree(a6xx_state->vbif_debugbus->data); + + kfree(a6xx_state->vbif_debugbus); + + for (i = 0; i < a6xx_state->nr_cx_debugbus; i++) + kfree(a6xx_state->cx_debugbus[i].data); + + kfree(a6xx_state->cx_debugbus); + + adreno_gpu_state_destroy(state); + kfree(a6xx_state); +} + +int a6xx_gpu_state_put(struct msm_gpu_state *state) +{ + if (IS_ERR_OR_NULL(state)) + return 1; + + return kref_put(&state->ref, a6xx_gpu_state_destroy); +} + +static void a6xx_show_registers(const u32 *registers, u32 *data, size_t count, + struct drm_printer *p) +{ + int i, index = 0; + + if (!data) + return; + + for (i = 0; i < count; i += 2) { + u32 count = RANGE(registers, i); + u32 offset = registers[i]; + int j; + + for (j = 0; j < count; index++, offset++, j++) { + if (data[index] == 0xdeafbead) + continue; + + drm_printf(p, " - { offset: 0x%06x, value: 0x%08x }\n", + offset << 2, data[index]); + } + } +} + +static void print_ascii85(struct drm_printer *p, size_t len, u32 *data) +{ + char out[ASCII85_BUFSZ]; + long i, l, datalen = 0; + + for (i = 0; i < len >> 2; i++) { + if (data[i]) + datalen = (i + 1) << 2; + } + + if (datalen == 0) + return; + + drm_puts(p, " data: !!ascii85 |\n"); + drm_puts(p, " "); + + + l = ascii85_encode_len(datalen); + + for (i = 0; i < l; i++) + drm_puts(p, ascii85_encode(data[i], out)); + + drm_puts(p, "\n"); +} + +static void print_name(struct drm_printer *p, const char *fmt, const char *name) +{ + drm_puts(p, fmt); + drm_puts(p, name); + drm_puts(p, "\n"); +} + +static void a6xx_show_shader(struct a6xx_gpu_state_obj *obj, + struct drm_printer *p) +{ + const struct a6xx_shader_block *block = obj->handle; + int i; + + if (!obj->handle) + return; + + print_name(p, " - type: ", block->name); + + for (i = 0; i < A6XX_NUM_SHADER_BANKS; i++) { + drm_printf(p, " - bank: %d\n", i); + drm_printf(p, " size: %d\n", block->size); + + if (!obj->data) + continue; + + print_ascii85(p, block->size << 2, + obj->data + (block->size * i)); + } +} + +static void a6xx_show_cluster_data(const u32 *registers, int size, u32 *data, + struct drm_printer *p) +{ + int ctx, index = 0; + + for (ctx = 0; ctx < A6XX_NUM_CONTEXTS; ctx++) { + int j; + + drm_printf(p, " - context: %d\n", ctx); + + for (j = 0; j < size; j += 2) { + u32 count = RANGE(registers, j); + u32 offset = registers[j]; + int k; + + for (k = 0; k < count; index++, offset++, k++) { + if (data[index] == 0xdeafbead) + continue; + + drm_printf(p, " - { offset: 0x%06x, value: 0x%08x }\n", + offset << 2, data[index]); + } + } + } +} + +static void a6xx_show_dbgahb_cluster(struct a6xx_gpu_state_obj *obj, + struct drm_printer *p) +{ + const struct a6xx_dbgahb_cluster *dbgahb = obj->handle; + + if (dbgahb) { + print_name(p, " - cluster-name: ", dbgahb->name); + a6xx_show_cluster_data(dbgahb->registers, dbgahb->count, + obj->data, p); + } +} + +static void a6xx_show_cluster(struct a6xx_gpu_state_obj *obj, + struct drm_printer *p) +{ + const struct a6xx_cluster *cluster = obj->handle; + + if (cluster) { + print_name(p, " - cluster-name: ", cluster->name); + a6xx_show_cluster_data(cluster->registers, cluster->count, + obj->data, p); + } +} + +static void a6xx_show_indexed_regs(struct a6xx_gpu_state_obj *obj, + struct drm_printer *p) +{ + const struct a6xx_indexed_registers *indexed = obj->handle; + + if (!indexed) + return; + + print_name(p, " - regs-name: ", indexed->name); + drm_printf(p, " dwords: %d\n", indexed->count); + + print_ascii85(p, indexed->count << 2, obj->data); +} + +static void a6xx_show_debugbus_block(const struct a6xx_debugbus_block *block, + u32 *data, struct drm_printer *p) +{ + if (block) { + print_name(p, " - debugbus-block: ", block->name); + + /* + * count for regular debugbus data is in quadwords, + * but print the size in dwords for consistency + */ + drm_printf(p, " count: %d\n", block->count << 1); + + print_ascii85(p, block->count << 3, data); + } +} + +static void a6xx_show_debugbus(struct a6xx_gpu_state *a6xx_state, + struct drm_printer *p) +{ + int i; + + for (i = 0; i < a6xx_state->nr_debugbus; i++) { + struct a6xx_gpu_state_obj *obj = &a6xx_state->debugbus[i]; + + a6xx_show_debugbus_block(obj->handle, obj->data, p); + } + + if (a6xx_state->vbif_debugbus) { + struct a6xx_gpu_state_obj *obj = a6xx_state->vbif_debugbus; + + drm_puts(p, " - debugbus-block: A6XX_DBGBUS_VBIF\n"); + drm_printf(p, " count: %d\n", VBIF_DEBUGBUS_BLOCK_SIZE); + + /* vbif debugbus data is in dwords. Confusing, huh? */ + print_ascii85(p, VBIF_DEBUGBUS_BLOCK_SIZE << 2, obj->data); + } + + for (i = 0; i < a6xx_state->nr_cx_debugbus; i++) { + struct a6xx_gpu_state_obj *obj = &a6xx_state->cx_debugbus[i]; + + a6xx_show_debugbus_block(obj->handle, obj->data, p); + } +} + +void a6xx_show(struct msm_gpu *gpu, struct msm_gpu_state *state, + struct drm_printer *p) +{ + struct a6xx_gpu_state *a6xx_state = container_of(state, + struct a6xx_gpu_state, base); + int i; + + if (IS_ERR_OR_NULL(state)) + return; + + adreno_show(gpu, state, p); + + drm_puts(p, "registers:\n"); + for (i = 0; i < a6xx_state->nr_registers; i++) { + struct a6xx_gpu_state_obj *obj = &a6xx_state->registers[i]; + const struct a6xx_registers *regs = obj->handle; + + if (!obj->handle) + continue; + + a6xx_show_registers(regs->registers, obj->data, regs->count, p); + } + + drm_puts(p, "registers-gmu:\n"); + for (i = 0; i < a6xx_state->nr_gmu_registers; i++) { + struct a6xx_gpu_state_obj *obj = &a6xx_state->gmu_registers[i]; + const struct a6xx_registers *regs = obj->handle; + + if (!obj->handle) + continue; + + a6xx_show_registers(regs->registers, obj->data, regs->count, p); + } + + drm_puts(p, "indexed-registers:\n"); + for (i = 0; i < a6xx_state->nr_indexed_regs; i++) + a6xx_show_indexed_regs(&a6xx_state->indexed_regs[i], p); + + drm_puts(p, "shader-blocks:\n"); + for (i = 0; i < a6xx_state->nr_shaders; i++) + a6xx_show_shader(&a6xx_state->shaders[i], p); + + drm_puts(p, "clusters:\n"); + for (i = 0; i < a6xx_state->nr_clusters; i++) + a6xx_show_cluster(&a6xx_state->clusters[i], p); + + for (i = 0; i < a6xx_state->nr_dbgahb_clusters; i++) + a6xx_show_dbgahb_cluster(&a6xx_state->dbgahb_clusters[i], p); + + drm_puts(p, "debugbus:\n"); + a6xx_show_debugbus(a6xx_state, p); +} diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h new file mode 100644 index 000000000000..68cccfa2870a --- /dev/null +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h @@ -0,0 +1,430 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2018 The Linux Foundation. All rights reserved. */ + +#ifndef _A6XX_CRASH_DUMP_H_ +#define _A6XX_CRASH_DUMP_H_ + +#include "a6xx.xml.h" + +#define A6XX_NUM_CONTEXTS 2 +#define A6XX_NUM_SHADER_BANKS 3 + +static const u32 a6xx_gras_cluster[] = { + 0x8000, 0x8006, 0x8010, 0x8092, 0x8094, 0x809d, 0x80a0, 0x80a6, + 0x80af, 0x80f1, 0x8100, 0x8107, 0x8109, 0x8109, 0x8110, 0x8110, + 0x8400, 0x840b, +}; + +static const u32 a6xx_ps_cluster_rac[] = { + 0x8800, 0x8806, 0x8809, 0x8811, 0x8818, 0x881e, 0x8820, 0x8865, + 0x8870, 0x8879, 0x8880, 0x8889, 0x8890, 0x8891, 0x8898, 0x8898, + 0x88c0, 0x88c1, 0x88d0, 0x88e3, 0x8900, 0x890c, 0x890f, 0x891a, + 0x8c00, 0x8c01, 0x8c08, 0x8c10, 0x8c17, 0x8c1f, 0x8c26, 0x8c33, +}; + +static const u32 a6xx_ps_cluster_rbp[] = { + 0x88f0, 0x88f3, 0x890d, 0x890e, 0x8927, 0x8928, 0x8bf0, 0x8bf1, + 0x8c02, 0x8c07, 0x8c11, 0x8c16, 0x8c20, 0x8c25, +}; + +static const u32 a6xx_ps_cluster[] = { + 0x9200, 0x9216, 0x9218, 0x9236, 0x9300, 0x9306, +}; + +static const u32 a6xx_fe_cluster[] = { + 0x9300, 0x9306, 0x9800, 0x9806, 0x9b00, 0x9b07, 0xa000, 0xa009, + 0xa00e, 0xa0ef, 0xa0f8, 0xa0f8, +}; + +static const u32 a6xx_pc_vs_cluster[] = { + 0x9100, 0x9108, 0x9300, 0x9306, 0x9980, 0x9981, 0x9b00, 0x9b07, +}; + +#define CLUSTER_FE 0 +#define CLUSTER_SP_VS 1 +#define CLUSTER_PC_VS 2 +#define CLUSTER_GRAS 3 +#define CLUSTER_SP_PS 4 +#define CLUSTER_PS 5 + +#define CLUSTER(_id, _reg, _sel_reg, _sel_val) \ + { .id = _id, .name = #_id,\ + .registers = _reg, \ + .count = ARRAY_SIZE(_reg), \ + .sel_reg = _sel_reg, .sel_val = _sel_val } + +static const struct a6xx_cluster { + u32 id; + const char *name; + const u32 *registers; + size_t count; + u32 sel_reg; + u32 sel_val; +} a6xx_clusters[] = { + CLUSTER(CLUSTER_GRAS, a6xx_gras_cluster, 0, 0), + CLUSTER(CLUSTER_PS, a6xx_ps_cluster_rac, REG_A6XX_RB_RB_SUB_BLOCK_SEL_CNTL_CD, 0x0), + CLUSTER(CLUSTER_PS, a6xx_ps_cluster_rbp, REG_A6XX_RB_RB_SUB_BLOCK_SEL_CNTL_CD, 0x9), + CLUSTER(CLUSTER_PS, a6xx_ps_cluster, 0, 0), + CLUSTER(CLUSTER_FE, a6xx_fe_cluster, 0, 0), + CLUSTER(CLUSTER_PC_VS, a6xx_pc_vs_cluster, 0, 0), +}; + +static const u32 a6xx_sp_vs_hlsq_cluster[] = { + 0xb800, 0xb803, 0xb820, 0xb822, +}; + +static const u32 a6xx_sp_vs_sp_cluster[] = { + 0xa800, 0xa824, 0xa830, 0xa83c, 0xa840, 0xa864, 0xa870, 0xa895, + 0xa8a0, 0xa8af, 0xa8c0, 0xa8c3, +}; + +static const u32 a6xx_hlsq_duplicate_cluster[] = { + 0xbb10, 0xbb11, 0xbb20, 0xbb29, +}; + +static const u32 a6xx_hlsq_2d_duplicate_cluster[] = { + 0xbd80, 0xbd80, +}; + +static const u32 a6xx_sp_duplicate_cluster[] = { + 0xab00, 0xab00, 0xab04, 0xab05, 0xab10, 0xab1b, 0xab20, 0xab20, +}; + +static const u32 a6xx_tp_duplicate_cluster[] = { + 0xb300, 0xb307, 0xb309, 0xb309, 0xb380, 0xb382, +}; + +static const u32 a6xx_sp_ps_hlsq_cluster[] = { + 0xb980, 0xb980, 0xb982, 0xb987, 0xb990, 0xb99b, 0xb9a0, 0xb9a2, + 0xb9c0, 0xb9c9, +}; + +static const u32 a6xx_sp_ps_hlsq_2d_cluster[] = { + 0xbd80, 0xbd80, +}; + +static const u32 a6xx_sp_ps_sp_cluster[] = { + 0xa980, 0xa9a8, 0xa9b0, 0xa9bc, 0xa9d0, 0xa9d3, 0xa9e0, 0xa9f3, + 0xaa00, 0xaa00, 0xaa30, 0xaa31, +}; + +static const u32 a6xx_sp_ps_sp_2d_cluster[] = { + 0xacc0, 0xacc0, +}; + +static const u32 a6xx_sp_ps_tp_cluster[] = { + 0xb180, 0xb183, 0xb190, 0xb191, +}; + +static const u32 a6xx_sp_ps_tp_2d_cluster[] = { + 0xb4c0, 0xb4d1, +}; + +#define CLUSTER_DBGAHB(_id, _base, _type, _reg) \ + { .name = #_id, .statetype = _type, .base = _base, \ + .registers = _reg, .count = ARRAY_SIZE(_reg) } + +static const struct a6xx_dbgahb_cluster { + const char *name; + u32 statetype; + u32 base; + const u32 *registers; + size_t count; +} a6xx_dbgahb_clusters[] = { + CLUSTER_DBGAHB(CLUSTER_SP_VS, 0x0002e000, 0x41, a6xx_sp_vs_hlsq_cluster), + CLUSTER_DBGAHB(CLUSTER_SP_VS, 0x0002a000, 0x21, a6xx_sp_vs_sp_cluster), + CLUSTER_DBGAHB(CLUSTER_SP_VS, 0x0002e000, 0x41, a6xx_hlsq_duplicate_cluster), + CLUSTER_DBGAHB(CLUSTER_SP_VS, 0x0002f000, 0x45, a6xx_hlsq_2d_duplicate_cluster), + CLUSTER_DBGAHB(CLUSTER_SP_VS, 0x0002a000, 0x21, a6xx_sp_duplicate_cluster), + CLUSTER_DBGAHB(CLUSTER_SP_VS, 0x0002c000, 0x1, a6xx_tp_duplicate_cluster), + CLUSTER_DBGAHB(CLUSTER_SP_PS, 0x0002e000, 0x42, a6xx_sp_ps_hlsq_cluster), + CLUSTER_DBGAHB(CLUSTER_SP_PS, 0x0002f000, 0x46, a6xx_sp_ps_hlsq_2d_cluster), + CLUSTER_DBGAHB(CLUSTER_SP_PS, 0x0002a000, 0x22, a6xx_sp_ps_sp_cluster), + CLUSTER_DBGAHB(CLUSTER_SP_PS, 0x0002b000, 0x26, a6xx_sp_ps_sp_2d_cluster), + CLUSTER_DBGAHB(CLUSTER_SP_PS, 0x0002c000, 0x2, a6xx_sp_ps_tp_cluster), + CLUSTER_DBGAHB(CLUSTER_SP_PS, 0x0002d000, 0x6, a6xx_sp_ps_tp_2d_cluster), + CLUSTER_DBGAHB(CLUSTER_SP_PS, 0x0002e000, 0x42, a6xx_hlsq_duplicate_cluster), + CLUSTER_DBGAHB(CLUSTER_SP_PS, 0x0002a000, 0x22, a6xx_sp_duplicate_cluster), + CLUSTER_DBGAHB(CLUSTER_SP_PS, 0x0002c000, 0x2, a6xx_tp_duplicate_cluster), +}; + +static const u32 a6xx_hlsq_registers[] = { + 0xbe00, 0xbe01, 0xbe04, 0xbe05, 0xbe08, 0xbe09, 0xbe10, 0xbe15, + 0xbe20, 0xbe23, +}; + +static const u32 a6xx_sp_registers[] = { + 0xae00, 0xae04, 0xae0c, 0xae0c, 0xae0f, 0xae2b, 0xae30, 0xae32, + 0xae35, 0xae35, 0xae3a, 0xae3f, 0xae50, 0xae52, +}; + +static const u32 a6xx_tp_registers[] = { + 0xb600, 0xb601, 0xb604, 0xb605, 0xb610, 0xb61b, 0xb620, 0xb623, +}; + +struct a6xx_registers { + const u32 *registers; + size_t count; + u32 val0; + u32 val1; +}; + +#define HLSQ_DBG_REGS(_base, _type, _array) \ + { .val0 = _base, .val1 = _type, .registers = _array, \ + .count = ARRAY_SIZE(_array), } + +static const struct a6xx_registers a6xx_hlsq_reglist[] = { + HLSQ_DBG_REGS(0x0002F800, 0x40, a6xx_hlsq_registers), + HLSQ_DBG_REGS(0x0002B800, 0x20, a6xx_sp_registers), + HLSQ_DBG_REGS(0x0002D800, 0x0, a6xx_tp_registers), +}; + +#define SHADER(_type, _size) \ + { .type = _type, .name = #_type, .size = _size } + +static const struct a6xx_shader_block { + const char *name; + u32 type; + u32 size; +} a6xx_shader_blocks[] = { + SHADER(A6XX_TP0_TMO_DATA, 0x200), + SHADER(A6XX_TP0_SMO_DATA, 0x80), + SHADER(A6XX_TP0_MIPMAP_BASE_DATA, 0x3c0), + SHADER(A6XX_TP1_TMO_DATA, 0x200), + SHADER(A6XX_TP1_SMO_DATA, 0x80), + SHADER(A6XX_TP1_MIPMAP_BASE_DATA, 0x3c0), + SHADER(A6XX_SP_INST_DATA, 0x800), + SHADER(A6XX_SP_LB_0_DATA, 0x800), + SHADER(A6XX_SP_LB_1_DATA, 0x800), + SHADER(A6XX_SP_LB_2_DATA, 0x800), + SHADER(A6XX_SP_LB_3_DATA, 0x800), + SHADER(A6XX_SP_LB_4_DATA, 0x800), + SHADER(A6XX_SP_LB_5_DATA, 0x200), + SHADER(A6XX_SP_CB_BINDLESS_DATA, 0x2000), + SHADER(A6XX_SP_CB_LEGACY_DATA, 0x280), + SHADER(A6XX_SP_UAV_DATA, 0x80), + SHADER(A6XX_SP_INST_TAG, 0x80), + SHADER(A6XX_SP_CB_BINDLESS_TAG, 0x80), + SHADER(A6XX_SP_TMO_UMO_TAG, 0x80), + SHADER(A6XX_SP_SMO_TAG, 0x80), + SHADER(A6XX_SP_STATE_DATA, 0x3f), + SHADER(A6XX_HLSQ_CHUNK_CVS_RAM, 0x1c0), + SHADER(A6XX_HLSQ_CHUNK_CPS_RAM, 0x280), + SHADER(A6XX_HLSQ_CHUNK_CVS_RAM_TAG, 0x40), + SHADER(A6XX_HLSQ_CHUNK_CPS_RAM_TAG, 0x40), + SHADER(A6XX_HLSQ_ICB_CVS_CB_BASE_TAG, 0x4), + SHADER(A6XX_HLSQ_ICB_CPS_CB_BASE_TAG, 0x4), + SHADER(A6XX_HLSQ_CVS_MISC_RAM, 0x1c0), + SHADER(A6XX_HLSQ_CPS_MISC_RAM, 0x580), + SHADER(A6XX_HLSQ_INST_RAM, 0x800), + SHADER(A6XX_HLSQ_GFX_CVS_CONST_RAM, 0x800), + SHADER(A6XX_HLSQ_GFX_CPS_CONST_RAM, 0x800), + SHADER(A6XX_HLSQ_CVS_MISC_RAM_TAG, 0x8), + SHADER(A6XX_HLSQ_CPS_MISC_RAM_TAG, 0x4), + SHADER(A6XX_HLSQ_INST_RAM_TAG, 0x80), + SHADER(A6XX_HLSQ_GFX_CVS_CONST_RAM_TAG, 0xc), + SHADER(A6XX_HLSQ_GFX_CPS_CONST_RAM_TAG, 0x10), + SHADER(A6XX_HLSQ_PWR_REST_RAM, 0x28), + SHADER(A6XX_HLSQ_PWR_REST_TAG, 0x14), + SHADER(A6XX_HLSQ_DATAPATH_META, 0x40), + SHADER(A6XX_HLSQ_FRONTEND_META, 0x40), + SHADER(A6XX_HLSQ_INDIRECT_META, 0x40), +}; + +static const u32 a6xx_rb_rac_registers[] = { + 0x8e04, 0x8e05, 0x8e07, 0x8e08, 0x8e10, 0x8e1c, 0x8e20, 0x8e25, + 0x8e28, 0x8e28, 0x8e2c, 0x8e2f, 0x8e50, 0x8e52, +}; + +static const u32 a6xx_rb_rbp_registers[] = { + 0x8e01, 0x8e01, 0x8e0c, 0x8e0c, 0x8e3b, 0x8e3e, 0x8e40, 0x8e43, + 0x8e53, 0x8e5f, 0x8e70, 0x8e77, +}; + +static const u32 a6xx_registers[] = { + /* RBBM */ + 0x0000, 0x0002, 0x0010, 0x0010, 0x0012, 0x0012, 0x0018, 0x001b, + 0x001e, 0x0032, 0x0038, 0x003c, 0x0042, 0x0042, 0x0044, 0x0044, + 0x0047, 0x0047, 0x0056, 0x0056, 0x00ad, 0x00ae, 0x00b0, 0x00fb, + 0x0100, 0x011d, 0x0200, 0x020d, 0x0218, 0x023d, 0x0400, 0x04f9, + 0x0500, 0x0500, 0x0505, 0x050b, 0x050e, 0x0511, 0x0533, 0x0533, + 0x0540, 0x0555, + /* CP */ + 0x0800, 0x0808, 0x0810, 0x0813, 0x0820, 0x0821, 0x0823, 0x0824, + 0x0826, 0x0827, 0x0830, 0x0833, 0x0840, 0x0843, 0x084f, 0x086f, + 0x0880, 0x088a, 0x08a0, 0x08ab, 0x08c0, 0x08c4, 0x08d0, 0x08dd, + 0x08f0, 0x08f3, 0x0900, 0x0903, 0x0908, 0x0911, 0x0928, 0x093e, + 0x0942, 0x094d, 0x0980, 0x0984, 0x098d, 0x0996, 0x0998, 0x099e, + 0x09a0, 0x09a6, 0x09a8, 0x09ae, 0x09b0, 0x09b1, 0x09c2, 0x09c8, + 0x0a00, 0x0a03, + /* VSC */ + 0x0c00, 0x0c04, 0x0c06, 0x0c06, 0x0c10, 0x0cd9, 0x0e00, 0x0e0e, + /* UCHE */ + 0x0e10, 0x0e13, 0x0e17, 0x0e19, 0x0e1c, 0x0e2b, 0x0e30, 0x0e32, + 0x0e38, 0x0e39, + /* GRAS */ + 0x8600, 0x8601, 0x8610, 0x861b, 0x8620, 0x8620, 0x8628, 0x862b, + 0x8630, 0x8637, + /* VPC */ + 0x9600, 0x9604, 0x9624, 0x9637, + /* PC */ + 0x9e00, 0x9e01, 0x9e03, 0x9e0e, 0x9e11, 0x9e16, 0x9e19, 0x9e19, + 0x9e1c, 0x9e1c, 0x9e20, 0x9e23, 0x9e30, 0x9e31, 0x9e34, 0x9e34, + 0x9e70, 0x9e72, 0x9e78, 0x9e79, 0x9e80, 0x9fff, + /* VFD */ + 0xa600, 0xa601, 0xa603, 0xa603, 0xa60a, 0xa60a, 0xa610, 0xa617, + 0xa630, 0xa630, +}; + +#define REGS(_array, _sel_reg, _sel_val) \ + { .registers = _array, .count = ARRAY_SIZE(_array), \ + .val0 = _sel_reg, .val1 = _sel_val } + +static const struct a6xx_registers a6xx_reglist[] = { + REGS(a6xx_registers, 0, 0), + REGS(a6xx_rb_rac_registers, REG_A6XX_RB_RB_SUB_BLOCK_SEL_CNTL_CD, 0), + REGS(a6xx_rb_rbp_registers, REG_A6XX_RB_RB_SUB_BLOCK_SEL_CNTL_CD, 9), +}; + +static const u32 a6xx_ahb_registers[] = { + /* RBBM_STATUS - RBBM_STATUS3 */ + 0x210, 0x213, + /* CP_STATUS_1 */ + 0x825, 0x825, +}; + +static const u32 a6xx_vbif_registers[] = { + 0x3000, 0x3007, 0x300c, 0x3014, 0x3018, 0x302d, 0x3030, 0x3031, + 0x3034, 0x3036, 0x303c, 0x303d, 0x3040, 0x3040, 0x3042, 0x3042, + 0x3049, 0x3049, 0x3058, 0x3058, 0x305a, 0x3061, 0x3064, 0x3068, + 0x306c, 0x306d, 0x3080, 0x3088, 0x308b, 0x308c, 0x3090, 0x3094, + 0x3098, 0x3098, 0x309c, 0x309c, 0x30c0, 0x30c0, 0x30c8, 0x30c8, + 0x30d0, 0x30d0, 0x30d8, 0x30d8, 0x30e0, 0x30e0, 0x3100, 0x3100, + 0x3108, 0x3108, 0x3110, 0x3110, 0x3118, 0x3118, 0x3120, 0x3120, + 0x3124, 0x3125, 0x3129, 0x3129, 0x3131, 0x3131, 0x3154, 0x3154, + 0x3156, 0x3156, 0x3158, 0x3158, 0x315a, 0x315a, 0x315c, 0x315c, + 0x315e, 0x315e, 0x3160, 0x3160, 0x3162, 0x3162, 0x340c, 0x340c, + 0x3410, 0x3410, 0x3800, 0x3801, +}; + +static const struct a6xx_registers a6xx_ahb_reglist[] = { + REGS(a6xx_ahb_registers, 0, 0), + REGS(a6xx_vbif_registers, 0, 0), +}; + +static const u32 a6xx_gmu_gx_registers[] = { + /* GMU GX */ + 0x0000, 0x0000, 0x0010, 0x0013, 0x0016, 0x0016, 0x0018, 0x001b, + 0x001e, 0x001e, 0x0020, 0x0023, 0x0026, 0x0026, 0x0028, 0x002b, + 0x002e, 0x002e, 0x0030, 0x0033, 0x0036, 0x0036, 0x0038, 0x003b, + 0x003e, 0x003e, 0x0040, 0x0043, 0x0046, 0x0046, 0x0080, 0x0084, + 0x0100, 0x012b, 0x0140, 0x0140, +}; + +static const u32 a6xx_gmu_cx_registers[] = { + /* GMU CX */ + 0x4c00, 0x4c07, 0x4c10, 0x4c12, 0x4d00, 0x4d00, 0x4d07, 0x4d0a, + 0x5000, 0x5004, 0x5007, 0x5008, 0x500b, 0x500c, 0x500f, 0x501c, + 0x5024, 0x502a, 0x502d, 0x5030, 0x5040, 0x5053, 0x5087, 0x5089, + 0x50a0, 0x50a2, 0x50a4, 0x50af, 0x50c0, 0x50c3, 0x50d0, 0x50d0, + 0x50e4, 0x50e4, 0x50e8, 0x50ec, 0x5100, 0x5103, 0x5140, 0x5140, + 0x5142, 0x5144, 0x514c, 0x514d, 0x514f, 0x5151, 0x5154, 0x5154, + 0x5157, 0x5158, 0x515d, 0x515d, 0x5162, 0x5162, 0x5164, 0x5165, + 0x5180, 0x5186, 0x5190, 0x519e, 0x51c0, 0x51c0, 0x51c5, 0x51cc, + 0x51e0, 0x51e2, 0x51f0, 0x51f0, 0x5200, 0x5201, + /* GPU RSCC */ + 0x8c8c, 0x8c8c, 0x8d01, 0x8d02, 0x8f40, 0x8f42, 0x8f44, 0x8f47, + 0x8f4c, 0x8f87, 0x8fec, 0x8fef, 0x8ff4, 0x902f, 0x9094, 0x9097, + 0x909c, 0x90d7, 0x913c, 0x913f, 0x9144, 0x917f, + /* GMU AO */ + 0x9300, 0x9316, 0x9400, 0x9400, + /* GPU CC */ + 0x9800, 0x9812, 0x9840, 0x9852, 0x9c00, 0x9c04, 0x9c07, 0x9c0b, + 0x9c15, 0x9c1c, 0x9c1e, 0x9c2d, 0x9c3c, 0x9c3d, 0x9c3f, 0x9c40, + 0x9c42, 0x9c49, 0x9c58, 0x9c5a, 0x9d40, 0x9d5e, 0xa000, 0xa002, + 0xa400, 0xa402, 0xac00, 0xac02, 0xb000, 0xb002, 0xb400, 0xb402, + 0xb800, 0xb802, + /* GPU CC ACD */ + 0xbc00, 0xbc16, 0xbc20, 0xbc27, +}; + +static const struct a6xx_registers a6xx_gmu_reglist[] = { + REGS(a6xx_gmu_cx_registers, 0, 0), + REGS(a6xx_gmu_gx_registers, 0, 0), +}; + +static const struct a6xx_indexed_registers { + const char *name; + u32 addr; + u32 data; + u32 count; +} a6xx_indexed_reglist[] = { + { "CP_SEQ_STAT", REG_A6XX_CP_SQE_STAT_ADDR, + REG_A6XX_CP_SQE_STAT_DATA, 0x33 }, + { "CP_DRAW_STATE", REG_A6XX_CP_DRAW_STATE_ADDR, + REG_A6XX_CP_DRAW_STATE_DATA, 0x100 }, + { "CP_UCODE_DBG_DATA", REG_A6XX_CP_SQE_UCODE_DBG_ADDR, + REG_A6XX_CP_SQE_UCODE_DBG_DATA, 0x6000 }, + { "CP_ROQ", REG_A6XX_CP_ROQ_DBG_ADDR, + REG_A6XX_CP_ROQ_DBG_DATA, 0x400 }, +}; + +static const struct a6xx_indexed_registers a6xx_cp_mempool_indexed = { + "CP_MEMPOOOL", REG_A6XX_CP_MEM_POOL_DBG_ADDR, + REG_A6XX_CP_MEM_POOL_DBG_DATA, 0x2060, +}; + +#define DEBUGBUS(_id, _count) { .id = _id, .name = #_id, .count = _count } + +static const struct a6xx_debugbus_block { + const char *name; + u32 id; + u32 count; +} a6xx_debugbus_blocks[] = { + DEBUGBUS(A6XX_DBGBUS_CP, 0x100), + DEBUGBUS(A6XX_DBGBUS_RBBM, 0x100), + DEBUGBUS(A6XX_DBGBUS_HLSQ, 0x100), + DEBUGBUS(A6XX_DBGBUS_UCHE, 0x100), + DEBUGBUS(A6XX_DBGBUS_DPM, 0x100), + DEBUGBUS(A6XX_DBGBUS_TESS, 0x100), + DEBUGBUS(A6XX_DBGBUS_PC, 0x100), + DEBUGBUS(A6XX_DBGBUS_VFDP, 0x100), + DEBUGBUS(A6XX_DBGBUS_VPC, 0x100), + DEBUGBUS(A6XX_DBGBUS_TSE, 0x100), + DEBUGBUS(A6XX_DBGBUS_RAS, 0x100), + DEBUGBUS(A6XX_DBGBUS_VSC, 0x100), + DEBUGBUS(A6XX_DBGBUS_COM, 0x100), + DEBUGBUS(A6XX_DBGBUS_LRZ, 0x100), + DEBUGBUS(A6XX_DBGBUS_A2D, 0x100), + DEBUGBUS(A6XX_DBGBUS_CCUFCHE, 0x100), + DEBUGBUS(A6XX_DBGBUS_RBP, 0x100), + DEBUGBUS(A6XX_DBGBUS_DCS, 0x100), + DEBUGBUS(A6XX_DBGBUS_DBGC, 0x100), + DEBUGBUS(A6XX_DBGBUS_GMU_GX, 0x100), + DEBUGBUS(A6XX_DBGBUS_TPFCHE, 0x100), + DEBUGBUS(A6XX_DBGBUS_GPC, 0x100), + DEBUGBUS(A6XX_DBGBUS_LARC, 0x100), + DEBUGBUS(A6XX_DBGBUS_HLSQ_SPTP, 0x100), + DEBUGBUS(A6XX_DBGBUS_RB_0, 0x100), + DEBUGBUS(A6XX_DBGBUS_RB_1, 0x100), + DEBUGBUS(A6XX_DBGBUS_UCHE_WRAPPER, 0x100), + DEBUGBUS(A6XX_DBGBUS_CCU_0, 0x100), + DEBUGBUS(A6XX_DBGBUS_CCU_1, 0x100), + DEBUGBUS(A6XX_DBGBUS_VFD_0, 0x100), + DEBUGBUS(A6XX_DBGBUS_VFD_1, 0x100), + DEBUGBUS(A6XX_DBGBUS_VFD_2, 0x100), + DEBUGBUS(A6XX_DBGBUS_VFD_3, 0x100), + DEBUGBUS(A6XX_DBGBUS_SP_0, 0x100), + DEBUGBUS(A6XX_DBGBUS_SP_1, 0x100), + DEBUGBUS(A6XX_DBGBUS_TPL1_0, 0x100), + DEBUGBUS(A6XX_DBGBUS_TPL1_1, 0x100), + DEBUGBUS(A6XX_DBGBUS_TPL1_2, 0x100), + DEBUGBUS(A6XX_DBGBUS_TPL1_3, 0x100), +}; + +static const struct a6xx_debugbus_block a6xx_cx_debugbus_blocks[] = { + DEBUGBUS(A6XX_DBGBUS_GMU_CX, 0x100), + DEBUGBUS(A6XX_DBGBUS_CX, 0x100), +}; + +#endif From patchwork Fri Nov 2 15:25:26 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jordan Crouse X-Patchwork-Id: 10665677 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1CEB7109C for ; Fri, 2 Nov 2018 15:26:03 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 095F828D72 for ; Fri, 2 Nov 2018 15:26:03 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id F16522B5BF; Fri, 2 Nov 2018 15:26:02 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id F17AA2B5D7 for ; Fri, 2 Nov 2018 15:26:01 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D5A846E568; Fri, 2 Nov 2018 15:26:00 +0000 (UTC) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from smtp.codeaurora.org (smtp.codeaurora.org [198.145.29.96]) by gabe.freedesktop.org (Postfix) with ESMTPS id 527186E567; Fri, 2 Nov 2018 15:25:59 +0000 (UTC) Received: by smtp.codeaurora.org (Postfix, from userid 1000) id DE74860C64; Fri, 2 Nov 2018 15:25:41 +0000 (UTC) Received: from jcrouse-lnx.qualcomm.com (i-global254.qualcomm.com [199.106.103.254]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: jcrouse@smtp.codeaurora.org) by smtp.codeaurora.org (Postfix) with ESMTPSA id 3C65961322; Fri, 2 Nov 2018 15:25:38 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org 3C65961322 From: Jordan Crouse To: freedreno@lists.freedesktop.org Subject: [PATCH 10/10] drm/msm/a6xx: Track and manage a6xx state memory Date: Fri, 2 Nov 2018 09:25:26 -0600 Message-Id: <20181102152526.23854-11-jcrouse@codeaurora.org> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20181102152526.23854-1-jcrouse@codeaurora.org> References: <20181102152526.23854-1-jcrouse@codeaurora.org> X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-arm-msm@vger.kernel.org, hoegsberg@chromium.org, smasetty@codeaurora.org, dri-devel@lists.freedesktop.org MIME-Version: 1.0 Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" X-Virus-Scanned: ClamAV using ClamSMTP The a6xx GPU state allocates a LOT of memory. Add a bit of infrastructure to track the memory allocations in the GPU structure and delete them when the state is destroyed much the same way that devm works with the device model as a whole. This protects against the developer accidentally forgetting to add a kfree() to an ever growing list. Signed-off-by: Jordan Crouse --- drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 192 +++++++++++--------- 1 file changed, 102 insertions(+), 90 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c index 20f5b914c6fb..ec57ddeb8c77 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c @@ -41,6 +41,8 @@ struct a6xx_gpu_state { struct a6xx_gpu_state_obj *cx_debugbus; int nr_cx_debugbus; + + struct list_head objs; }; static inline int CRASHDUMP_WRITE(u64 *in, u32 reg, u32 val) @@ -73,6 +75,33 @@ struct a6xx_crashdumper { u64 iova; }; +struct a6xx_state_memobj { + struct list_head node; + unsigned long long data[]; +}; + +void *state_kcalloc(struct a6xx_gpu_state *a6xx_state, int nr, size_t objsize) +{ + struct a6xx_state_memobj *obj = + kzalloc((nr * objsize) + sizeof(*obj), GFP_KERNEL); + + if (!obj) + return NULL; + + list_add_tail(&obj->node, &a6xx_state->objs); + return &obj->data; +} + +void *state_kmemdup(struct a6xx_gpu_state *a6xx_state, void *src, + size_t size) +{ + void *dst = state_kcalloc(a6xx_state, 1, size); + + if (dst) + memcpy(dst, src, size); + return dst; +} + /* * Allocate 1MB for the crashdumper scratch region - 8k for the script and * the rest for the data @@ -203,12 +232,17 @@ static int vbif_debugbus_read(struct msm_gpu *gpu, u32 ctrl0, u32 ctrl1, (12 * XIN_CORE_BLOCKS)) static void a6xx_get_vbif_debugbus_block(struct msm_gpu *gpu, + struct a6xx_gpu_state *a6xx_state, struct a6xx_gpu_state_obj *obj) { u32 clk, *ptr; int i; - obj->data = kcalloc(VBIF_DEBUGBUS_BLOCK_SIZE, sizeof(u32), GFP_KERNEL); + obj->data = state_kcalloc(a6xx_state, VBIF_DEBUGBUS_BLOCK_SIZE, + sizeof(u32)); + if (!obj->data) + return; + obj->handle = NULL; /* Get the current clock setting */ @@ -252,13 +286,14 @@ static void a6xx_get_vbif_debugbus_block(struct msm_gpu *gpu, } static void a6xx_get_debugbus_block(struct msm_gpu *gpu, + struct a6xx_gpu_state *a6xx_state, const struct a6xx_debugbus_block *block, struct a6xx_gpu_state_obj *obj) { int i; u32 *ptr; - obj->data = kcalloc(block->count, sizeof(u64), GFP_KERNEL); + obj->data = state_kcalloc(a6xx_state, block->count, sizeof(u64)); if (!obj->data) return; @@ -269,13 +304,14 @@ static void a6xx_get_debugbus_block(struct msm_gpu *gpu, } static void a6xx_get_cx_debugbus_block(void __iomem *cxdbg, + struct a6xx_gpu_state *a6xx_state, const struct a6xx_debugbus_block *block, struct a6xx_gpu_state_obj *obj) { int i; u32 *ptr; - obj->data = kcalloc(block->count, sizeof(u64), GFP_KERNEL); + obj->data = state_kcalloc(a6xx_state, block->count, sizeof(u64)); if (!obj->data) return; @@ -344,36 +380,42 @@ static void a6xx_get_debugbus(struct msm_gpu *gpu, cxdbg_write(cxdbg, REG_A6XX_DBGC_CFG_DBGBUS_MASKL_3, 0); } - a6xx_state->debugbus = kcalloc(ARRAY_SIZE(a6xx_debugbus_blocks), - sizeof(*a6xx_state->debugbus), GFP_KERNEL); + a6xx_state->debugbus = state_kcalloc(a6xx_state, + ARRAY_SIZE(a6xx_debugbus_blocks), + sizeof(*a6xx_state->debugbus)); if (a6xx_state->debugbus) { int i; for (i = 0; i < ARRAY_SIZE(a6xx_debugbus_blocks); i++) a6xx_get_debugbus_block(gpu, + a6xx_state, &a6xx_debugbus_blocks[i], &a6xx_state->debugbus[i]); a6xx_state->nr_debugbus = ARRAY_SIZE(a6xx_debugbus_blocks); } - a6xx_state->vbif_debugbus = kzalloc(sizeof(*a6xx_state->vbif_debugbus), - GFP_KERNEL); + a6xx_state->vbif_debugbus = + state_kcalloc(a6xx_state, 1, + sizeof(*a6xx_state->vbif_debugbus)); if (a6xx_state->vbif_debugbus) - a6xx_get_vbif_debugbus_block(gpu, a6xx_state->vbif_debugbus); + a6xx_get_vbif_debugbus_block(gpu, a6xx_state, + a6xx_state->vbif_debugbus); if (cxdbg) { a6xx_state->cx_debugbus = - kcalloc(ARRAY_SIZE(a6xx_cx_debugbus_blocks), - sizeof(*a6xx_state->cx_debugbus), GFP_KERNEL); + state_kcalloc(a6xx_state, + ARRAY_SIZE(a6xx_cx_debugbus_blocks), + sizeof(*a6xx_state->cx_debugbus)); if (a6xx_state->cx_debugbus) { int i; for (i = 0; i < ARRAY_SIZE(a6xx_cx_debugbus_blocks); i++) a6xx_get_cx_debugbus_block(cxdbg, + a6xx_state, &a6xx_cx_debugbus_blocks[i], &a6xx_state->cx_debugbus[i]); @@ -389,6 +431,7 @@ static void a6xx_get_debugbus(struct msm_gpu *gpu, /* Read a data cluster from behind the AHB aperture */ static void a6xx_get_dbgahb_cluster(struct msm_gpu *gpu, + struct a6xx_gpu_state *a6xx_state, const struct a6xx_dbgahb_cluster *dbgahb, struct a6xx_gpu_state_obj *obj, struct a6xx_crashdumper *dumper) @@ -429,8 +472,8 @@ static void a6xx_get_dbgahb_cluster(struct msm_gpu *gpu, return; obj->handle = dbgahb; - obj->data = kmemdup(dumper->ptr + A6XX_CD_DATA_OFFSET, - datasize, GFP_KERNEL); + obj->data = state_kmemdup(a6xx_state, dumper->ptr + A6XX_CD_DATA_OFFSET, + datasize); } static void a6xx_get_dbgahb_clusters(struct msm_gpu *gpu, @@ -439,8 +482,9 @@ static void a6xx_get_dbgahb_clusters(struct msm_gpu *gpu, { int i; - a6xx_state->dbgahb_clusters = kcalloc(ARRAY_SIZE(a6xx_dbgahb_clusters), - sizeof(*a6xx_state->dbgahb_clusters), GFP_KERNEL); + a6xx_state->dbgahb_clusters = state_kcalloc(a6xx_state, + ARRAY_SIZE(a6xx_dbgahb_clusters), + sizeof(*a6xx_state->dbgahb_clusters)); if (!a6xx_state->dbgahb_clusters) return; @@ -448,12 +492,14 @@ static void a6xx_get_dbgahb_clusters(struct msm_gpu *gpu, a6xx_state->nr_dbgahb_clusters = ARRAY_SIZE(a6xx_dbgahb_clusters); for (i = 0; i < ARRAY_SIZE(a6xx_dbgahb_clusters); i++) - a6xx_get_dbgahb_cluster(gpu, &a6xx_dbgahb_clusters[i], + a6xx_get_dbgahb_cluster(gpu, a6xx_state, + &a6xx_dbgahb_clusters[i], &a6xx_state->dbgahb_clusters[i], dumper); } /* Read a data cluster from the CP aperture with the crashdumper */ static void a6xx_get_cluster(struct msm_gpu *gpu, + struct a6xx_gpu_state *a6xx_state, const struct a6xx_cluster *cluster, struct a6xx_gpu_state_obj *obj, struct a6xx_crashdumper *dumper) @@ -497,8 +543,8 @@ static void a6xx_get_cluster(struct msm_gpu *gpu, return; obj->handle = cluster; - obj->data = kmemdup(dumper->ptr + A6XX_CD_DATA_OFFSET, - datasize, GFP_KERNEL); + obj->data = state_kmemdup(a6xx_state, dumper->ptr + A6XX_CD_DATA_OFFSET, + datasize); } static void a6xx_get_clusters(struct msm_gpu *gpu, @@ -507,8 +553,8 @@ static void a6xx_get_clusters(struct msm_gpu *gpu, { int i; - a6xx_state->clusters = kcalloc(ARRAY_SIZE(a6xx_clusters), - sizeof(*a6xx_state->clusters), GFP_KERNEL); + a6xx_state->clusters = state_kcalloc(a6xx_state, + ARRAY_SIZE(a6xx_clusters), sizeof(*a6xx_state->clusters)); if (!a6xx_state->clusters) return; @@ -516,12 +562,13 @@ static void a6xx_get_clusters(struct msm_gpu *gpu, a6xx_state->nr_clusters = ARRAY_SIZE(a6xx_clusters); for (i = 0; i < ARRAY_SIZE(a6xx_clusters); i++) - a6xx_get_cluster(gpu, &a6xx_clusters[i], + a6xx_get_cluster(gpu, a6xx_state, &a6xx_clusters[i], &a6xx_state->clusters[i], dumper); } /* Read a shader / debug block from the HLSQ aperture with the crashdumper */ static void a6xx_get_shader_block(struct msm_gpu *gpu, + struct a6xx_gpu_state *a6xx_state, const struct a6xx_shader_block *block, struct a6xx_gpu_state_obj *obj, struct a6xx_crashdumper *dumper) @@ -547,8 +594,8 @@ static void a6xx_get_shader_block(struct msm_gpu *gpu, return; obj->handle = block; - obj->data = kmemdup(dumper->ptr + A6XX_CD_DATA_OFFSET, - datasize, GFP_KERNEL); + obj->data = state_kmemdup(a6xx_state, dumper->ptr + A6XX_CD_DATA_OFFSET, + datasize); } static void a6xx_get_shaders(struct msm_gpu *gpu, @@ -557,8 +604,8 @@ static void a6xx_get_shaders(struct msm_gpu *gpu, { int i; - a6xx_state->shaders = kcalloc(ARRAY_SIZE(a6xx_shader_blocks), - sizeof(*a6xx_state->shaders), GFP_KERNEL); + a6xx_state->shaders = state_kcalloc(a6xx_state, + ARRAY_SIZE(a6xx_shader_blocks), sizeof(*a6xx_state->shaders)); if (!a6xx_state->shaders) return; @@ -566,12 +613,13 @@ static void a6xx_get_shaders(struct msm_gpu *gpu, a6xx_state->nr_shaders = ARRAY_SIZE(a6xx_shader_blocks); for (i = 0; i < ARRAY_SIZE(a6xx_shader_blocks); i++) - a6xx_get_shader_block(gpu, &a6xx_shader_blocks[i], + a6xx_get_shader_block(gpu, a6xx_state, &a6xx_shader_blocks[i], &a6xx_state->shaders[i], dumper); } /* Read registers from behind the HLSQ aperture with the crashdumper */ static void a6xx_get_crashdumper_hlsq_registers(struct msm_gpu *gpu, + struct a6xx_gpu_state *a6xx_state, const struct a6xx_registers *regs, struct a6xx_gpu_state_obj *obj, struct a6xx_crashdumper *dumper) @@ -603,12 +651,13 @@ static void a6xx_get_crashdumper_hlsq_registers(struct msm_gpu *gpu, return; obj->handle = regs; - obj->data = kmemdup(dumper->ptr + A6XX_CD_DATA_OFFSET, - regcount * sizeof(u32), GFP_KERNEL); + obj->data = state_kmemdup(a6xx_state, dumper->ptr + A6XX_CD_DATA_OFFSET, + regcount * sizeof(u32)); } /* Read a block of registers using the crashdumper */ static void a6xx_get_crashdumper_registers(struct msm_gpu *gpu, + struct a6xx_gpu_state *a6xx_state, const struct a6xx_registers *regs, struct a6xx_gpu_state_obj *obj, struct a6xx_crashdumper *dumper) @@ -640,12 +689,13 @@ static void a6xx_get_crashdumper_registers(struct msm_gpu *gpu, return; obj->handle = regs; - obj->data = kmemdup(dumper->ptr + A6XX_CD_DATA_OFFSET, - regcount * sizeof(u32), GFP_KERNEL); + obj->data = state_kmemdup(a6xx_state, dumper->ptr + A6XX_CD_DATA_OFFSET, + regcount * sizeof(u32)); } /* Read a block of registers via AHB */ static void a6xx_get_ahb_gpu_registers(struct msm_gpu *gpu, + struct a6xx_gpu_state *a6xx_state, const struct a6xx_registers *regs, struct a6xx_gpu_state_obj *obj) { @@ -655,7 +705,7 @@ static void a6xx_get_ahb_gpu_registers(struct msm_gpu *gpu, regcount += RANGE(regs->registers, i); obj->handle = (const void *) regs; - obj->data = kcalloc(regcount, sizeof(u32), GFP_KERNEL); + obj->data = state_kcalloc(a6xx_state, regcount, sizeof(u32)); if (!obj->data) return; @@ -671,6 +721,7 @@ static void a6xx_get_ahb_gpu_registers(struct msm_gpu *gpu, /* Read a block of GMU registers */ static void _a6xx_get_gmu_registers(struct msm_gpu *gpu, + struct a6xx_gpu_state *a6xx_state, const struct a6xx_registers *regs, struct a6xx_gpu_state_obj *obj) { @@ -683,7 +734,7 @@ static void _a6xx_get_gmu_registers(struct msm_gpu *gpu, regcount += RANGE(regs->registers, i); obj->handle = (const void *) regs; - obj->data = kcalloc(regcount, sizeof(u32), GFP_KERNEL); + obj->data = state_kcalloc(a6xx_state, regcount, sizeof(u32)); if (!obj->data) return; @@ -703,8 +754,8 @@ static void a6xx_get_gmu_registers(struct msm_gpu *gpu, struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu); - a6xx_state->gmu_registers = kcalloc(2, - sizeof(*a6xx_state->gmu_registers), GFP_KERNEL); + a6xx_state->gmu_registers = state_kcalloc(a6xx_state, + 2, sizeof(*a6xx_state->gmu_registers)); if (!a6xx_state->gmu_registers) return; @@ -712,7 +763,7 @@ static void a6xx_get_gmu_registers(struct msm_gpu *gpu, a6xx_state->nr_gmu_registers = 2; /* Get the CX GMU registers from AHB */ - _a6xx_get_gmu_registers(gpu, &a6xx_gmu_reglist[0], + _a6xx_get_gmu_registers(gpu, a6xx_state, &a6xx_gmu_reglist[0], &a6xx_state->gmu_registers[0]); if (!a6xx_gmu_gx_is_on(&a6xx_gpu->gmu)) @@ -721,7 +772,7 @@ static void a6xx_get_gmu_registers(struct msm_gpu *gpu, /* Set the fence to ALLOW mode so we can access the registers */ gpu_write(gpu, REG_A6XX_GMU_AO_AHB_FENCE_CTRL, 0); - _a6xx_get_gmu_registers(gpu, &a6xx_gmu_reglist[1], + _a6xx_get_gmu_registers(gpu, a6xx_state, &a6xx_gmu_reglist[1], &a6xx_state->gmu_registers[1]); } @@ -734,8 +785,8 @@ static void a6xx_get_registers(struct msm_gpu *gpu, ARRAY_SIZE(a6xx_hlsq_reglist); int index = 0; - a6xx_state->registers = kcalloc(count, sizeof(*a6xx_state->registers), - GFP_KERNEL); + a6xx_state->registers = state_kcalloc(a6xx_state, + count, sizeof(*a6xx_state->registers)); if (!a6xx_state->registers) return; @@ -744,31 +795,32 @@ static void a6xx_get_registers(struct msm_gpu *gpu, for (i = 0; i < ARRAY_SIZE(a6xx_ahb_reglist); i++) a6xx_get_ahb_gpu_registers(gpu, - &a6xx_ahb_reglist[i], + a6xx_state, &a6xx_ahb_reglist[i], &a6xx_state->registers[index++]); for (i = 0; i < ARRAY_SIZE(a6xx_reglist); i++) a6xx_get_crashdumper_registers(gpu, - &a6xx_reglist[i], + a6xx_state, &a6xx_reglist[i], &a6xx_state->registers[index++], dumper); for (i = 0; i < ARRAY_SIZE(a6xx_hlsq_reglist); i++) a6xx_get_crashdumper_hlsq_registers(gpu, - &a6xx_hlsq_reglist[i], + a6xx_state, &a6xx_hlsq_reglist[i], &a6xx_state->registers[index++], dumper); } /* Read a block of data from an indexed register pair */ static void a6xx_get_indexed_regs(struct msm_gpu *gpu, + struct a6xx_gpu_state *a6xx_state, const struct a6xx_indexed_registers *indexed, struct a6xx_gpu_state_obj *obj) { int i; obj->handle = (const void *) indexed; - obj->data = kcalloc(indexed->count, sizeof(u32), GFP_KERNEL); + obj->data = state_kcalloc(a6xx_state, indexed->count, sizeof(u32)); if (!obj->data) return; @@ -787,13 +839,13 @@ static void a6xx_get_indexed_registers(struct msm_gpu *gpu, int count = ARRAY_SIZE(a6xx_indexed_reglist) + 1; int i; - a6xx_state->indexed_regs = kcalloc(count, - sizeof(a6xx_state->indexed_regs), GFP_KERNEL); + a6xx_state->indexed_regs = state_kcalloc(a6xx_state, count, + sizeof(a6xx_state->indexed_regs)); if (!a6xx_state->indexed_regs) return; for (i = 0; i < ARRAY_SIZE(a6xx_indexed_reglist); i++) - a6xx_get_indexed_regs(gpu, &a6xx_indexed_reglist[i], + a6xx_get_indexed_regs(gpu, a6xx_state, &a6xx_indexed_reglist[i], &a6xx_state->indexed_regs[i]); /* Set the CP mempool size to 0 to stabilize it while dumping */ @@ -801,7 +853,7 @@ static void a6xx_get_indexed_registers(struct msm_gpu *gpu, gpu_write(gpu, REG_A6XX_CP_MEM_POOL_SIZE, 0); /* Get the contents of the CP mempool */ - a6xx_get_indexed_regs(gpu, &a6xx_cp_mempool_indexed, + a6xx_get_indexed_regs(gpu, a6xx_state, &a6xx_cp_mempool_indexed, &a6xx_state->indexed_regs[i]); /* @@ -827,6 +879,8 @@ struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu *gpu) if (!a6xx_state) return ERR_PTR(-ENOMEM); + INIT_LIST_HEAD(&a6xx_state->objs); + /* Get the generic state from the adreno core */ adreno_gpu_state_get(gpu, &a6xx_state->base); @@ -856,56 +910,14 @@ struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu *gpu) void a6xx_gpu_state_destroy(struct kref *kref) { + struct a6xx_state_memobj *obj, *tmp; struct msm_gpu_state *state = container_of(kref, struct msm_gpu_state, ref); struct a6xx_gpu_state *a6xx_state = container_of(state, struct a6xx_gpu_state, base); - int i; - - for (i = 0; i < a6xx_state->nr_gmu_registers; i++) - kfree(a6xx_state->gmu_registers[i].data); - - kfree(a6xx_state->gmu_registers); - - for (i = 0; i < a6xx_state->nr_registers; i++) - kfree(a6xx_state->registers[i].data); - - kfree(a6xx_state->registers); - - for (i = 0; i < a6xx_state->nr_shaders; i++) - kfree(a6xx_state->shaders[i].data); - - kfree(a6xx_state->shaders); - - for (i = 0; i < a6xx_state->nr_clusters; i++) - kfree(a6xx_state->clusters[i].data); - - kfree(a6xx_state->clusters); - - for (i = 0; i < a6xx_state->nr_dbgahb_clusters; i++) - kfree(a6xx_state->dbgahb_clusters[i].data); - - kfree(a6xx_state->dbgahb_clusters); - - for (i = 0; i < a6xx_state->nr_indexed_regs; i++) - kfree(a6xx_state->indexed_regs[i].data); - - kfree(a6xx_state->indexed_regs); - - for (i = 0; i < a6xx_state->nr_debugbus; i++) - kfree(a6xx_state->debugbus[i].data); - - kfree(a6xx_state->debugbus); - - if (a6xx_state->vbif_debugbus) - kfree(a6xx_state->vbif_debugbus->data); - - kfree(a6xx_state->vbif_debugbus); - - for (i = 0; i < a6xx_state->nr_cx_debugbus; i++) - kfree(a6xx_state->cx_debugbus[i].data); - kfree(a6xx_state->cx_debugbus); + list_for_each_entry_safe(obj, tmp, &a6xx_state->objs, node) + kfree(obj); adreno_gpu_state_destroy(state); kfree(a6xx_state);