From patchwork Fri Jan 26 21:08:50 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jordan Crouse X-Patchwork-Id: 10187033 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 28BA1602C8 for ; Fri, 26 Jan 2018 21:09:15 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 194682A80F for ; Fri, 26 Jan 2018 21:09:15 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0CD202A822; Fri, 26 Jan 2018 21:09:15 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.1 required=2.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_MED,T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 466E82A80F for ; Fri, 26 Jan 2018 21:09:14 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 3031D6E80A; Fri, 26 Jan 2018 21:09:06 +0000 (UTC) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from smtp.codeaurora.org (smtp.codeaurora.org [198.145.29.96]) by gabe.freedesktop.org (Postfix) with ESMTPS id 46FB16E767; Fri, 26 Jan 2018 21:08:58 +0000 (UTC) Received: by smtp.codeaurora.org (Postfix, from userid 1000) id 2CCB06050D; Fri, 26 Jan 2018 21:08:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=codeaurora.org; s=default; t=1517000938; bh=BehhzhYORISiR9TV2NgadYyTrTD776r1jw3Lpnqdlpc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=HOB97gpgzsDncOWz9TgPy1Ho3Y/aAfuu7DTO5wsM+TOzGUrXSdiZhPNWZirm2di9O OoM7OCLDZzKUp4BRGzne5IjF77PRn5ZmzxKMzU6ePOgV/IhL0R83Q16shKRCJws6IB vphvR2XpXjj54ZeF7A1xg9RRvib1OoDrluFvrYLU= Received: from jcrouse-lnx.qualcomm.com (i-global254.qualcomm.com [199.106.103.254]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: jcrouse@smtp.codeaurora.org) by smtp.codeaurora.org (Postfix) with ESMTPSA id 9C79C60A5F; Fri, 26 Jan 2018 21:08:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=codeaurora.org; s=default; t=1517000937; bh=BehhzhYORISiR9TV2NgadYyTrTD776r1jw3Lpnqdlpc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=gCXSjiOL9m5c5+VHsq9C+tHpyZPpWk3xuOnA65fvkGgo/cEY0XTDSCSK7epNr5nBV QYN7vOATRVMIsVOIp5vR/ChbQ5907gql7nZlUB86jHC61dyJlVmcR6wpMPw+HP2LUj 35uGNpJCQaytBADFYdlu/pFYGc61qFd/AlVkng3I= DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org 9C79C60A5F Authentication-Results: pdx-caf-mail.web.codeaurora.org; dmarc=none (p=none dis=none) header.from=codeaurora.org Authentication-Results: pdx-caf-mail.web.codeaurora.org; spf=none smtp.mailfrom=jcrouse@codeaurora.org From: Jordan Crouse To: freedreno@lists.freedesktop.org Subject: [PATCH 6/6] drm/msm/adreno: Add a5xx specific registers for the GPU state Date: Fri, 26 Jan 2018 14:08:50 -0700 Message-Id: <1517000930-1893-7-git-send-email-jcrouse@codeaurora.org> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1517000930-1893-1-git-send-email-jcrouse@codeaurora.org> References: <1517000930-1893-1-git-send-email-jcrouse@codeaurora.org> X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-arm-msm@vger.kernel.org, dri-devel@lists.freedesktop.org MIME-Version: 1.0 Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" X-Virus-Scanned: ClamAV using ClamSMTP HLSQ, SP and TP registers are only accessible from a special aperture and to make matters worsex, the aperture is blocked from the CPU on targets that can support secure rendering. Luckily the GPU hardware has its own purpose built register dumper that can access the registers from the aperture. Add a5xx specific code to program the crashdumper and retrieve the wayward registers and dump them for the crash state. Also, remove a block of registers the regular CPU accessible list that aren't useful for debug which helps reduce the size of the crash state file by a goodly amount. Signed-off-by: Jordan Crouse --- drivers/gpu/drm/msm/adreno/a3xx_gpu.c | 8 +- drivers/gpu/drm/msm/adreno/a4xx_gpu.c | 8 +- drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 234 ++++++++++++++++++++++++++++++-- drivers/gpu/drm/msm/adreno/adreno_gpu.c | 23 ++-- drivers/gpu/drm/msm/adreno/adreno_gpu.h | 4 +- 5 files changed, 246 insertions(+), 31 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c index 72e523f1..14340e3 100644 --- a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c @@ -420,10 +420,12 @@ static void a3xx_dump(struct msm_gpu *gpu) static struct msm_gpu_state *a3xx_gpu_state_get(struct msm_gpu *gpu) { - struct msm_gpu_state *state = adreno_gpu_state_get(gpu); + struct msm_gpu_state *state = kzalloc(sizeof(*state), GFP_KERNEL); - if (IS_ERR(state)) - return state; + if (!state) + return ERR_PTR(-ENOMEM); + + adreno_gpu_state_get(gpu, state); state->rbbm_status = gpu_read(gpu, REG_A3XX_RBBM_STATUS); diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c index e363e65..4dde681 100644 --- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c @@ -456,10 +456,12 @@ static irqreturn_t a4xx_irq(struct msm_gpu *gpu) static struct msm_gpu_state *a4xx_gpu_state_get(struct msm_gpu *gpu) { - struct msm_gpu_state *state = adreno_gpu_state_get(gpu); + struct msm_gpu_state *state = kzalloc(sizeof(*state), GFP_KERNEL); - if (IS_ERR(state)) - return state; + if (!state) + return ERR_PTR(-ENOMEM); + + adreno_gpu_state_get(gpu, state); state->rbbm_status = gpu_read(gpu, REG_A4XX_RBBM_STATUS); diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c index 971aaee..1fd158b 100644 --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c @@ -19,6 +19,7 @@ #include #include #include +#include #include "msm_gem.h" #include "msm_mmu.h" #include "a5xx_gpu.h" @@ -1077,8 +1078,9 @@ static irqreturn_t a5xx_irq(struct msm_gpu *gpu) 0xE800, 0xE806, 0xE810, 0xE89A, 0xE8A0, 0xE8A4, 0xE8AA, 0xE8EB, 0xE900, 0xE905, 0xEB80, 0xEB8F, 0xEBB0, 0xEBB0, 0xEC00, 0xEC05, 0xEC08, 0xECE9, 0xECF0, 0xECF0, 0xEA80, 0xEA80, 0xEA82, 0xEAA3, - 0xEAA5, 0xEAC2, 0xA800, 0xA8FF, 0xAC60, 0xAC60, 0xB000, 0xB97F, - 0xB9A0, 0xB9BF, ~0 + 0xEAA5, 0xEAC2, 0xA800, 0xA800, 0xA820, 0xA828, 0xA840, 0xA87D, + 0XA880, 0xA88D, 0xA890, 0xA8A3, 0xA8D0, 0xA8D8, 0xA8E0, 0xA8F5, + 0xAC60, 0xAC60, ~0, }; static void a5xx_dump(struct msm_gpu *gpu) @@ -1149,24 +1151,230 @@ static int a5xx_get_timestamp(struct msm_gpu *gpu, uint64_t *value) return 0; } +struct a5xx_crashdumper { + void *ptr; + struct drm_gem_object *bo; + u64 iova; +}; + +struct a5xx_gpu_state { + struct msm_gpu_state base; + u32 *hlsqregs; +}; + +#define gpu_poll_timeout(gpu, addr, val, cond, interval, timeout) \ + readl_poll_timeout((gpu)->mmio + ((addr) << 2), val, cond, \ + interval, timeout) + +static int a5xx_crashdumper_init(struct msm_gpu *gpu, + struct a5xx_crashdumper *dumper) +{ + dumper->ptr = msm_gem_kernel_new_locked(gpu->dev, + SZ_1M, MSM_BO_UNCACHED, gpu->aspace, + &dumper->bo, &dumper->iova); + + if (IS_ERR(dumper->ptr)) + return PTR_ERR(dumper->ptr); + + return 0; +} + +static void a5xx_crashdumper_free(struct msm_gpu *gpu, + struct a5xx_crashdumper *dumper) +{ + msm_gem_put_iova(dumper->bo, gpu->aspace); + + drm_gem_object_unreference(dumper->bo); +} + +static int a5xx_crashdumper_run(struct msm_gpu *gpu, + struct a5xx_crashdumper *dumper) +{ + u32 val; + + if (IS_ERR_OR_NULL(dumper->ptr)) + return -EINVAL; + + gpu_write64(gpu, REG_A5XX_CP_CRASH_SCRIPT_BASE_LO, + REG_A5XX_CP_CRASH_SCRIPT_BASE_HI, dumper->iova); + + gpu_write(gpu, REG_A5XX_CP_CRASH_DUMP_CNTL, 1); + + return gpu_poll_timeout(gpu, REG_A5XX_CP_CRASH_DUMP_CNTL, val, + val & 0x04, 100, 10000); +} + +/* + * These are a list of the registers that need to be read through the HLSQ + * aperture through the crashdumper. These are not nominally accessible from + * the CPU on a secure platform. + */ +static const struct { + u32 type; + u32 regoffset; + u32 count; +} a5xx_hlsq_aperture_regs[] = { + { 0x35, 0xe00, 0x32 }, /* HSLQ non-context */ + { 0x31, 0x2080, 0x1 }, /* HLSQ 2D context 0 */ + { 0x33, 0x2480, 0x1 }, /* HLSQ 2D context 1 */ + { 0x32, 0xe780, 0x62 }, /* HLSQ 3D context 0 */ + { 0x34, 0xef80, 0x62 }, /* HLSQ 3D context 1 */ + { 0x3f, 0x0ec0, 0x40 }, /* SP non-context */ + { 0x3d, 0x2040, 0x1 }, /* SP 2D context 0 */ + { 0x3b, 0x2440, 0x1 }, /* SP 2D context 1 */ + { 0x3e, 0xe580, 0x170 }, /* SP 3D context 0 */ + { 0x3c, 0xed80, 0x170 }, /* SP 3D context 1 */ + { 0x3a, 0x0f00, 0x1c }, /* TP non-context */ + { 0x38, 0x2000, 0xa }, /* TP 2D context 0 */ + { 0x36, 0x2400, 0xa }, /* TP 2D context 1 */ + { 0x39, 0xe700, 0x80 }, /* TP 3D context 0 */ + { 0x37, 0xef00, 0x80 }, /* TP 3D context 1 */ +}; + +static void a5xx_gpu_state_get_hlsq_regs(struct msm_gpu *gpu, + struct a5xx_gpu_state *a5xx_state) +{ + struct a5xx_crashdumper dumper = { 0 }; + u32 offset, count = 0; + u64 *ptr; + int i; + + if (a5xx_crashdumper_init(gpu, &dumper)) + return; + + /* The script will be written at offset 0 */ + ptr = dumper.ptr; + + /* Start writing the data at offset 256k */ + offset = dumper.iova + (256 * SZ_1K); + + /* Count how many additional registers to get from the HLSQ aperture */ + for (i = 0; i < ARRAY_SIZE(a5xx_hlsq_aperture_regs); i++) + count += a5xx_hlsq_aperture_regs[i].count; + + a5xx_state->hlsqregs = kcalloc(count, sizeof(u32), GFP_KERNEL); + if (!a5xx_state->hlsqregs) + return; + + /* Build the crashdump script */ + for (i = 0; i < ARRAY_SIZE(a5xx_hlsq_aperture_regs); i++) { + u32 type = a5xx_hlsq_aperture_regs[i].type; + u32 c = a5xx_hlsq_aperture_regs[i].count; + + /* Write the register to select the desired bank */ + *ptr++ = type << 8; + *ptr++ = (((u64) REG_A5XX_HLSQ_DBG_READ_SEL) << 44) | + (1 << 21) | 1; + + *ptr++ = offset; + *ptr++ = (((u64) REG_A5XX_HLSQ_DBG_AHB_READ_APERTURE) << 44) + | c; + + offset += c * sizeof(u32); + } + + /* Write two zeros to close off the script */ + *ptr++ = 0; + *ptr++ = 0; + + if (a5xx_crashdumper_run(gpu, &dumper)) { + kfree(a5xx_state->hlsqregs); + a5xx_crashdumper_free(gpu, &dumper); + return; + } + + /* Copy the data from the crashdumper to the state */ + memcpy(a5xx_state->hlsqregs, dumper.ptr + (256 * SZ_1K), + count * sizeof(u32)); + + a5xx_crashdumper_free(gpu, &dumper); +} + static struct msm_gpu_state *a5xx_gpu_state_get(struct msm_gpu *gpu) { - struct msm_gpu_state *state; + struct a5xx_gpu_state *a5xx_state = kzalloc(sizeof(*a5xx_state), + GFP_KERNEL); - /* - * Temporarily disable hardware clock gating before going into - * adreno_show to avoid issues while reading the registers - */ + if (!a5xx_state) + return ERR_PTR(-ENOMEM); + + /* Temporarily disable hardware clock gating before reading the hw */ a5xx_set_hwcg(gpu, false); - state = adreno_gpu_state_get(gpu); + /* First get the generic state from the adreno core */ + adreno_gpu_state_get(gpu, &(a5xx_state->base)); + + a5xx_state->base.rbbm_status = gpu_read(gpu, REG_A5XX_RBBM_STATUS); - if (!IS_ERR(state)) - state->rbbm_status = gpu_read(gpu, REG_A5XX_RBBM_STATUS); + /* Get the HLSQ regs with the help of the crashdumper */ + a5xx_gpu_state_get_hlsq_regs(gpu, a5xx_state); a5xx_set_hwcg(gpu, true); - return state; + return &a5xx_state->base; +} + +static void a5xx_gpu_state_destroy(struct kref *kref) +{ + struct msm_gpu_state *state = container_of(kref, + struct msm_gpu_state, ref); + struct a5xx_gpu_state *a5xx_state = container_of(state, + struct a5xx_gpu_state, base); + + kfree(a5xx_state->hlsqregs); + + adreno_gpu_state_destroy(state); + kfree(a5xx_state); +} + +int a5xx_gpu_state_put(struct msm_gpu_state *state) +{ + if (IS_ERR_OR_NULL(state)) + return 1; + + return kref_put(&state->ref, a5xx_gpu_state_destroy); +} + + +void a5xx_show(struct msm_gpu *gpu, struct msm_gpu_state *state, + struct seq_file *m) +{ + int i, j; + u32 pos = 0; + struct a5xx_gpu_state *a5xx_state = container_of(state, + struct a5xx_gpu_state, base); + + if (IS_ERR_OR_NULL(state)) + return; + + adreno_show(gpu, state, m); + + /* Dump the additional a5xx HLSQ registers */ + if (!a5xx_state->hlsqregs) + return; + + seq_puts(m, "registers:\n"); + seq_puts(m, " - [offset, value]\n"); + + for (i = 0; i < ARRAY_SIZE(a5xx_hlsq_aperture_regs); i++) { + u32 o = a5xx_hlsq_aperture_regs[i].regoffset; + u32 c = a5xx_hlsq_aperture_regs[i].count; + + for (j = 0; j < c; j++, pos++, o++) { + /* + * To keep the crashdump simple we pull the entire range + * for each register type but not all of the registers + * in the range are valid. Fortunately invalid registers + * stick out like a sore thumb with a value of + * 0xdeadbeef + */ + if (a5xx_state->hlsqregs[pos] == 0xdeadbeef) + continue; + + seq_printf(m, " - [0x%04x, 0x%08x]\n", + o << 2, a5xx_state->hlsqregs[pos]); + } + } } static struct msm_ringbuffer *a5xx_active_ring(struct msm_gpu *gpu) @@ -1198,11 +1406,11 @@ static int a5xx_gpu_busy(struct msm_gpu *gpu, uint64_t *value) .irq = a5xx_irq, .destroy = a5xx_destroy, #ifdef CONFIG_DEBUG_FS - .show = adreno_show, + .show = a5xx_show, #endif .gpu_busy = a5xx_gpu_busy, .gpu_state_get = a5xx_gpu_state_get, - .gpu_state_put = adreno_gpu_state_put, + .gpu_state_put = a5xx_gpu_state_put, }, .get_timestamp = a5xx_get_timestamp, }; diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c index 79df82b..fd39b3c 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c @@ -363,16 +363,11 @@ bool adreno_idle(struct msm_gpu *gpu, struct msm_ringbuffer *ring) return false; } -struct msm_gpu_state *adreno_gpu_state_get(struct msm_gpu *gpu) +int adreno_gpu_state_get(struct msm_gpu *gpu, struct msm_gpu_state *state) { struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); - struct msm_gpu_state *state; int i, count = 0; - state = kzalloc(sizeof(*state), GFP_KERNEL); - if (!state) - return ERR_PTR(-ENOMEM); - kref_init(&state->ref); do_gettimeofday(&state->time); @@ -426,14 +421,12 @@ struct msm_gpu_state *adreno_gpu_state_get(struct msm_gpu *gpu) state->nr_registers = count; } - return state; + return 0; } -static void adreno_gpu_state_destroy(struct kref *kref) +void adreno_gpu_state_destroy(struct msm_gpu_state *state) { int i; - struct msm_gpu_state *state = container_of(kref, - struct msm_gpu_state, ref); for(i = 0; i < ARRAY_SIZE(state->ring); i++) kfree(state->ring[i].data); @@ -441,6 +434,14 @@ static void adreno_gpu_state_destroy(struct kref *kref) kfree(state->comm); kfree(state->cmd); kfree(state->registers); +} + +static void adreno_gpu_state_kref_destroy(struct kref *kref) +{ + struct msm_gpu_state *state = container_of(kref, + struct msm_gpu_state, ref); + + adreno_gpu_state_destroy(state); kfree(state); } @@ -449,7 +450,7 @@ int adreno_gpu_state_put(struct msm_gpu_state *state) if (IS_ERR_OR_NULL(state)) return 1; - return kref_put(&state->ref, adreno_gpu_state_destroy); + return kref_put(&state->ref, adreno_gpu_state_kref_destroy); } #ifdef CONFIG_DEBUG_FS diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h index bcf755e..313e66a 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h @@ -220,7 +220,9 @@ int adreno_gpu_init(struct drm_device *drm, struct platform_device *pdev, int nr_rings); void adreno_gpu_cleanup(struct adreno_gpu *gpu); -struct msm_gpu_state *adreno_gpu_state_get(struct msm_gpu *gpu); +void adreno_gpu_state_destroy(struct msm_gpu_state *state); + +int adreno_gpu_state_get(struct msm_gpu *gpu, struct msm_gpu_state *state); int adreno_gpu_state_put(struct msm_gpu_state *state); /* ringbuffer helpers (the parts that are adreno specific) */