From patchwork Tue Jun 1 22:47:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Rob Clark X-Patchwork-Id: 12292379 From: Rob Clark To: dri-devel@lists.freedesktop.org Cc: Jordan Crouse , Jordan Crouse , Rob Clark , Rob Clark , Sean Paul , David Airlie , Daniel Vetter , AngeloGioacchino Del Regno , Konrad Dybcio , "Kristian H. 
Kristensen" , Marijn Suijten , Jonathan Marek , Akhil P Oommen , Sai Prakash Ranjan , Eric Anholt , Sharat Masetty , Douglas Anderson , linux-arm-msm@vger.kernel.org (open list:DRM DRIVER FOR MSM ADRENO GPU), freedreno@lists.freedesktop.org (open list:DRM DRIVER FOR MSM ADRENO GPU), linux-kernel@vger.kernel.org (open list) Subject: [PATCH v4 3/6] drm/msm: Improve the a6xx page fault handler Date: Tue, 1 Jun 2021 15:47:22 -0700 Message-Id: <20210601224750.513996-5-robdclark@gmail.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210601224750.513996-1-robdclark@gmail.com> References: <20210601224750.513996-1-robdclark@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-arm-msm@vger.kernel.org From: Jordan Crouse Use the new adreno-smmu-priv fault info function to get more SMMU debug registers and print the current TTBR0 to debug per-instance pagetables and figure out which GPU block generated the request. Signed-off-by: Jordan Crouse Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 4 +- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 76 +++++++++++++++++++++++++-- drivers/gpu/drm/msm/msm_iommu.c | 11 +++- drivers/gpu/drm/msm/msm_mmu.h | 4 +- 4 files changed, 87 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c index ce13d49e615b..a0eef5d9b89b 100644 --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c @@ -1075,7 +1075,7 @@ bool a5xx_idle(struct msm_gpu *gpu, struct msm_ringbuffer *ring) return true; } -static int a5xx_fault_handler(void *arg, unsigned long iova, int flags) +static int a5xx_fault_handler(void *arg, unsigned long iova, int flags, void *data) { struct msm_gpu *gpu = arg; pr_warn_ratelimited("*** gpu fault: iova=%08lx, flags=%d (%u,%u,%u,%u)\n", @@ -1085,7 +1085,7 @@ static int a5xx_fault_handler(void *arg, unsigned long iova, int flags) gpu_read(gpu, REG_A5XX_CP_SCRATCH_REG(6)), gpu_read(gpu, 
REG_A5XX_CP_SCRATCH_REG(7))); - return -EFAULT; + return 0; } static void a5xx_cp_err_irq(struct msm_gpu *gpu) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index 23464d735682..094dc17fd20f 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -959,18 +959,88 @@ static void a6xx_recover(struct msm_gpu *gpu) msm_gpu_hw_init(gpu); } -static int a6xx_fault_handler(void *arg, unsigned long iova, int flags) +static const char *a6xx_uche_fault_block(struct msm_gpu *gpu, u32 mid) +{ + static const char *uche_clients[7] = { + "VFD", "SP", "VSC", "VPC", "HLSQ", "PC", "LRZ", + }; + u32 val; + + if (mid < 1 || mid > 3) + return "UNKNOWN"; + + /* + * The source of the data depends on the mid ID read from FSYNR1 + * and the client ID read from the UCHE block. + */ + val = gpu_read(gpu, REG_A6XX_UCHE_CLIENT_PF); + + /* mid = 3 is most precise and refers to only one block per client */ + if (mid == 3) + return uche_clients[val & 7]; + + /* For mid=2 the source is TP or VFD except when the client id is 0 */ + if (mid == 2) + return ((val & 7) == 0) ? 
"TP" : "TP|VFD"; + + /* For mid=1 just return "UCHE" as a catchall for everything else */ + return "UCHE"; +} + +static const char *a6xx_fault_block(struct msm_gpu *gpu, u32 id) +{ + if (id == 0) + return "CP"; + else if (id == 4) + return "CCU"; + else if (id == 6) + return "CDP Prefetch"; + + return a6xx_uche_fault_block(gpu, id); +} + +#define ARM_SMMU_FSR_TF BIT(1) +#define ARM_SMMU_FSR_PF BIT(3) +#define ARM_SMMU_FSR_EF BIT(4) + +static int a6xx_fault_handler(void *arg, unsigned long iova, int flags, void *data) { struct msm_gpu *gpu = arg; + struct adreno_smmu_fault_info *info = data; + const char *type = "UNKNOWN"; - pr_warn_ratelimited("*** gpu fault: iova=%08lx, flags=%d (%u,%u,%u,%u)\n", + /* + * Print a default message if we couldn't get the data from the + * adreno-smmu-priv + */ + if (!info) { + pr_warn_ratelimited("*** gpu fault: iova=%.16lx flags=%d (%u,%u,%u,%u)\n", iova, flags, gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(4)), gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(5)), gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(6)), gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(7))); - return -EFAULT; + return 0; + } + + if (info->fsr & ARM_SMMU_FSR_TF) + type = "TRANSLATION"; + else if (info->fsr & ARM_SMMU_FSR_PF) + type = "PERMISSION"; + else if (info->fsr & ARM_SMMU_FSR_EF) + type = "EXTERNAL"; + + pr_warn_ratelimited("*** gpu fault: ttbr0=%.16llx iova=%.16lx dir=%s type=%s source=%s (%u,%u,%u,%u)\n", + info->ttbr0, iova, + flags & IOMMU_FAULT_WRITE ? 
"WRITE" : "READ", type, + a6xx_fault_block(gpu, info->fsynr1 & 0xff), + gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(4)), + gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(5)), + gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(6)), + gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(7))); + + return 0; } static void a6xx_cp_hw_err_irq(struct msm_gpu *gpu) diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c index 50d881794758..6975b95c3c29 100644 --- a/drivers/gpu/drm/msm/msm_iommu.c +++ b/drivers/gpu/drm/msm/msm_iommu.c @@ -211,8 +211,17 @@ static int msm_fault_handler(struct iommu_domain *domain, struct device *dev, unsigned long iova, int flags, void *arg) { struct msm_iommu *iommu = arg; + struct adreno_smmu_priv *adreno_smmu = dev_get_drvdata(iommu->base.dev); + struct adreno_smmu_fault_info info, *ptr = NULL; + + if (adreno_smmu->get_fault_info) { + adreno_smmu->get_fault_info(adreno_smmu->cookie, &info); + ptr = &info; + } + if (iommu->base.handler) - return iommu->base.handler(iommu->base.arg, iova, flags); + return iommu->base.handler(iommu->base.arg, iova, flags, ptr); + pr_warn_ratelimited("*** fault: iova=%16lx, flags=%d\n", iova, flags); return 0; } diff --git a/drivers/gpu/drm/msm/msm_mmu.h b/drivers/gpu/drm/msm/msm_mmu.h index 61ade89d9e48..a88f44c3268d 100644 --- a/drivers/gpu/drm/msm/msm_mmu.h +++ b/drivers/gpu/drm/msm/msm_mmu.h @@ -26,7 +26,7 @@ enum msm_mmu_type { struct msm_mmu { const struct msm_mmu_funcs *funcs; struct device *dev; - int (*handler)(void *arg, unsigned long iova, int flags); + int (*handler)(void *arg, unsigned long iova, int flags, void *data); void *arg; enum msm_mmu_type type; }; @@ -43,7 +43,7 @@ struct msm_mmu *msm_iommu_new(struct device *dev, struct iommu_domain *domain); struct msm_mmu *msm_gpummu_new(struct device *dev, struct msm_gpu *gpu); static inline void msm_mmu_set_fault_handler(struct msm_mmu *mmu, void *arg, - int (*handler)(void *arg, unsigned long iova, int flags)) + int (*handler)(void *arg, unsigned long iova, int 
flags, void *data)) { mmu->arg = arg; mmu->handler = handler; 
From patchwork Tue Jun 1 22:47:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Rob Clark X-Patchwork-Id: 12292381 From: Rob Clark To: dri-devel@lists.freedesktop.org Cc: Jordan Crouse , Rob Clark , Rob Clark , Sean Paul , David Airlie , Daniel Vetter , Iskren Chernev , Akhil P Oommen , AngeloGioacchino Del Regno , Konrad Dybcio , "Kristian H. 
Kristensen" , Marijn Suijten , Sai Prakash Ranjan , Sharat Masetty , Jonathan Marek , Zhenzhong Duan , Lee Jones , linux-arm-msm@vger.kernel.org (open list:DRM DRIVER FOR MSM ADRENO GPU), freedreno@lists.freedesktop.org (open list:DRM DRIVER FOR MSM ADRENO GPU), linux-kernel@vger.kernel.org (open list) Subject: [PATCH v4 5/6] drm/msm: Add crashdump support for stalled SMMU Date: Tue, 1 Jun 2021 15:47:24 -0700 Message-Id: <20210601224750.513996-7-robdclark@gmail.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210601224750.513996-1-robdclark@gmail.com> References: <20210601224750.513996-1-robdclark@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-arm-msm@vger.kernel.org From: Rob Clark For collecting devcoredumps with the SMMU stalled after an iova fault, we need to skip the parts of the GPU state which are normally collected with the hw crashdumper, since with the SMMU stalled the hw would be unable to write out the requested state to memory. Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/adreno/a2xx_gpu.c | 2 +- drivers/gpu/drm/msm/adreno/a3xx_gpu.c | 2 +- drivers/gpu/drm/msm/adreno/a4xx_gpu.c | 2 +- drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 5 ++- drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 2 +- drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 43 ++++++++++++++++----- drivers/gpu/drm/msm/msm_debugfs.c | 2 +- drivers/gpu/drm/msm/msm_gpu.c | 7 ++-- drivers/gpu/drm/msm/msm_gpu.h | 2 +- 9 files changed, 47 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a2xx_gpu.c b/drivers/gpu/drm/msm/adreno/a2xx_gpu.c index bdc989183c64..d2c31fae64fd 100644 --- a/drivers/gpu/drm/msm/adreno/a2xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a2xx_gpu.c @@ -434,7 +434,7 @@ static void a2xx_dump(struct msm_gpu *gpu) adreno_dump(gpu); } -static struct msm_gpu_state *a2xx_gpu_state_get(struct msm_gpu *gpu) +static struct msm_gpu_state *a2xx_gpu_state_get(struct msm_gpu *gpu, bool stalled) { struct msm_gpu_state *state = kzalloc(sizeof(*state), 
GFP_KERNEL); diff --git a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c index 4534633fe7cd..b1a6f87d74ef 100644 --- a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c @@ -464,7 +464,7 @@ static void a3xx_dump(struct msm_gpu *gpu) adreno_dump(gpu); } -static struct msm_gpu_state *a3xx_gpu_state_get(struct msm_gpu *gpu) +static struct msm_gpu_state *a3xx_gpu_state_get(struct msm_gpu *gpu, bool stalled) { struct msm_gpu_state *state = kzalloc(sizeof(*state), GFP_KERNEL); diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c index 82bebb40234d..22780a594d6f 100644 --- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c @@ -549,7 +549,7 @@ static const unsigned int a405_registers[] = { ~0 /* sentinel */ }; -static struct msm_gpu_state *a4xx_gpu_state_get(struct msm_gpu *gpu) +static struct msm_gpu_state *a4xx_gpu_state_get(struct msm_gpu *gpu, bool stalled) { struct msm_gpu_state *state = kzalloc(sizeof(*state), GFP_KERNEL); diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c index a0eef5d9b89b..2e7714b1a17f 100644 --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c @@ -1519,7 +1519,7 @@ static void a5xx_gpu_state_get_hlsq_regs(struct msm_gpu *gpu, msm_gem_kernel_put(dumper.bo, gpu->aspace, true); } -static struct msm_gpu_state *a5xx_gpu_state_get(struct msm_gpu *gpu) +static struct msm_gpu_state *a5xx_gpu_state_get(struct msm_gpu *gpu, bool stalled) { struct a5xx_gpu_state *a5xx_state = kzalloc(sizeof(*a5xx_state), GFP_KERNEL); @@ -1536,7 +1536,8 @@ static struct msm_gpu_state *a5xx_gpu_state_get(struct msm_gpu *gpu) a5xx_state->base.rbbm_status = gpu_read(gpu, REG_A5XX_RBBM_STATUS); /* Get the HLSQ regs with the help of the crashdumper */ - a5xx_gpu_state_get_hlsq_regs(gpu, a5xx_state); + if (!stalled) + a5xx_gpu_state_get_hlsq_regs(gpu, a5xx_state); 
a5xx_set_hwcg(gpu, true); diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h index ce0610c5256f..e0f06ce4e1a9 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h @@ -86,7 +86,7 @@ unsigned long a6xx_gmu_get_freq(struct msm_gpu *gpu); void a6xx_show(struct msm_gpu *gpu, struct msm_gpu_state *state, struct drm_printer *p); -struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu *gpu); +struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu *gpu, bool stalled); int a6xx_gpu_state_put(struct msm_gpu_state *state); #endif /* __A6XX_GPU_H__ */ diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c index c1699b4f9a89..d0af68a76c4f 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c @@ -833,6 +833,21 @@ static void a6xx_get_registers(struct msm_gpu *gpu, a6xx_state, &a6xx_vbif_reglist, &a6xx_state->registers[index++]); + if (!dumper) { + /* + * We can't use the crashdumper when the SMMU is stalled, + * because the GPU has no memory access until we resume + * translation (but we don't want to do that until after + * we have captured as much useful GPU state as possible). 
+ * So instead collect registers via the CPU: + */ + for (i = 0; i < ARRAY_SIZE(a6xx_reglist); i++) + a6xx_get_ahb_gpu_registers(gpu, + a6xx_state, &a6xx_reglist[i], + &a6xx_state->registers[index++]); + return; + } + for (i = 0; i < ARRAY_SIZE(a6xx_reglist); i++) a6xx_get_crashdumper_registers(gpu, a6xx_state, &a6xx_reglist[i], @@ -903,9 +918,9 @@ static void a6xx_get_indexed_registers(struct msm_gpu *gpu, a6xx_state->nr_indexed_regs = count; } -struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu *gpu) +struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu *gpu, bool stalled) { - struct a6xx_crashdumper dumper = { 0 }; + struct a6xx_crashdumper _dumper = { 0 }, *dumper = NULL; struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu); struct a6xx_gpu_state *a6xx_state = kzalloc(sizeof(*a6xx_state), @@ -928,14 +943,24 @@ struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu *gpu) /* Get the banks of indexed registers */ a6xx_get_indexed_registers(gpu, a6xx_state); - /* Try to initialize the crashdumper */ - if (!a6xx_crashdumper_init(gpu, &dumper)) { - a6xx_get_registers(gpu, a6xx_state, &dumper); - a6xx_get_shaders(gpu, a6xx_state, &dumper); - a6xx_get_clusters(gpu, a6xx_state, &dumper); - a6xx_get_dbgahb_clusters(gpu, a6xx_state, &dumper); + /* + * Try to initialize the crashdumper, if we are not dumping state + * with the SMMU stalled. 
The crashdumper needs memory access to + * write out GPU state, so we need to skip this when the SMMU is + * stalled in response to an iova fault + */ + if (!stalled && !a6xx_crashdumper_init(gpu, &_dumper)) { + dumper = &_dumper; + } + + a6xx_get_registers(gpu, a6xx_state, dumper); + + if (dumper) { + a6xx_get_shaders(gpu, a6xx_state, dumper); + a6xx_get_clusters(gpu, a6xx_state, dumper); + a6xx_get_dbgahb_clusters(gpu, a6xx_state, dumper); - msm_gem_kernel_put(dumper.bo, gpu->aspace, true); + msm_gem_kernel_put(dumper->bo, gpu->aspace, true); } if (snapshot_debugbus) diff --git a/drivers/gpu/drm/msm/msm_debugfs.c b/drivers/gpu/drm/msm/msm_debugfs.c index 7a2b53d35e6b..90558e826934 100644 --- a/drivers/gpu/drm/msm/msm_debugfs.c +++ b/drivers/gpu/drm/msm/msm_debugfs.c @@ -77,7 +77,7 @@ static int msm_gpu_open(struct inode *inode, struct file *file) goto free_priv; pm_runtime_get_sync(&gpu->pdev->dev); - show_priv->state = gpu->funcs->gpu_state_get(gpu); + show_priv->state = gpu->funcs->gpu_state_get(gpu, false); pm_runtime_put_sync(&gpu->pdev->dev); mutex_unlock(&dev->struct_mutex); diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c index fa7691cb4614..4d280bf446e6 100644 --- a/drivers/gpu/drm/msm/msm_gpu.c +++ b/drivers/gpu/drm/msm/msm_gpu.c @@ -381,7 +381,8 @@ static void msm_gpu_crashstate_get_bo(struct msm_gpu_state *state, } static void msm_gpu_crashstate_capture(struct msm_gpu *gpu, - struct msm_gem_submit *submit, char *comm, char *cmd) + struct msm_gem_submit *submit, char *comm, char *cmd, + bool stalled) { struct msm_gpu_state *state; @@ -393,7 +394,7 @@ static void msm_gpu_crashstate_capture(struct msm_gpu *gpu, if (gpu->crashstate) return; - state = gpu->funcs->gpu_state_get(gpu); + state = gpu->funcs->gpu_state_get(gpu, stalled); if (IS_ERR_OR_NULL(state)) return; @@ -519,7 +520,7 @@ static void recover_worker(struct kthread_work *work) /* Record the crash state */ pm_runtime_get_sync(&gpu->pdev->dev); - 
msm_gpu_crashstate_capture(gpu, submit, comm, cmd); + msm_gpu_crashstate_capture(gpu, submit, comm, cmd, false); pm_runtime_put_sync(&gpu->pdev->dev); kfree(cmd); diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h index 7a082a12d98f..c15e5fd675d2 100644 --- a/drivers/gpu/drm/msm/msm_gpu.h +++ b/drivers/gpu/drm/msm/msm_gpu.h @@ -60,7 +60,7 @@ struct msm_gpu_funcs { void (*debugfs_init)(struct msm_gpu *gpu, struct drm_minor *minor); #endif unsigned long (*gpu_busy)(struct msm_gpu *gpu); - struct msm_gpu_state *(*gpu_state_get)(struct msm_gpu *gpu); + struct msm_gpu_state *(*gpu_state_get)(struct msm_gpu *gpu, bool stalled); int (*gpu_state_put)(struct msm_gpu_state *state); unsigned long (*gpu_get_freq)(struct msm_gpu *gpu); void (*gpu_set_freq)(struct msm_gpu *gpu, struct dev_pm_opp *opp); 
From patchwork Tue Jun 1 22:47:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Rob Clark X-Patchwork-Id: 12292383 From: Rob Clark To: dri-devel@lists.freedesktop.org Cc: Jordan Crouse , Rob Clark , Rob Clark , Sean Paul , David Airlie , Daniel Vetter , Sai Prakash Ranjan , Jonathan Marek , Akhil P Oommen , Eric Anholt , Sharat Masetty , Douglas Anderson , Bjorn Andersson , linux-arm-msm@vger.kernel.org (open list:DRM DRIVER FOR MSM ADRENO GPU), freedreno@lists.freedesktop.org (open list:DRM DRIVER FOR MSM ADRENO GPU), linux-kernel@vger.kernel.org (open list) Subject: [PATCH v4 6/6] drm/msm: devcoredump iommu fault support Date: Tue, 1 Jun 2021 15:47:25 -0700 Message-Id: <20210601224750.513996-8-robdclark@gmail.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210601224750.513996-1-robdclark@gmail.com> References: <20210601224750.513996-1-robdclark@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-arm-msm@vger.kernel.org From: Rob Clark Wire up support to stall the SMMU on iova fault, and collect a devcoredump snapshot for easier debugging of faults. Currently this is a6xx-only, but mostly only because so far it is the only one using adreno-smmu-priv. 
Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 29 +++++++++++++-- drivers/gpu/drm/msm/adreno/adreno_gpu.c | 15 ++++++++ drivers/gpu/drm/msm/msm_gem.h | 1 + drivers/gpu/drm/msm/msm_gem_submit.c | 1 + drivers/gpu/drm/msm/msm_gpu.c | 48 +++++++++++++++++++++++++ drivers/gpu/drm/msm/msm_gpu.h | 17 +++++++++ drivers/gpu/drm/msm/msm_gpummu.c | 5 +++ drivers/gpu/drm/msm/msm_iommu.c | 11 ++++++ drivers/gpu/drm/msm/msm_mmu.h | 1 + 9 files changed, 126 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index 094dc17fd20f..0dcde917e575 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -1008,6 +1008,16 @@ static int a6xx_fault_handler(void *arg, unsigned long iova, int flags, void *da struct msm_gpu *gpu = arg; struct adreno_smmu_fault_info *info = data; const char *type = "UNKNOWN"; + const char *block; + bool do_devcoredump = info && !READ_ONCE(gpu->crashstate); + + /* + * If we aren't going to be resuming later from fault_worker, then do + * it now. + */ + if (!do_devcoredump) { + gpu->aspace->mmu->funcs->resume_translation(gpu->aspace->mmu); + } /* * Print a default message if we couldn't get the data from the @@ -1031,15 +1041,30 @@ static int a6xx_fault_handler(void *arg, unsigned long iova, int flags, void *da else if (info->fsr & ARM_SMMU_FSR_EF) type = "EXTERNAL"; + block = a6xx_fault_block(gpu, info->fsynr1 & 0xff); + pr_warn_ratelimited("*** gpu fault: ttbr0=%.16llx iova=%.16lx dir=%s type=%s source=%s (%u,%u,%u,%u)\n", info->ttbr0, iova, - flags & IOMMU_FAULT_WRITE ? "WRITE" : "READ", type, - a6xx_fault_block(gpu, info->fsynr1 & 0xff), + flags & IOMMU_FAULT_WRITE ? 
"WRITE" : "READ", + type, block, gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(4)), gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(5)), gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(6)), gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(7))); + if (do_devcoredump) { + /* Turn off the hangcheck timer to keep it from bothering us */ + del_timer(&gpu->hangcheck_timer); + + gpu->fault_info.ttbr0 = info->ttbr0; + gpu->fault_info.iova = iova; + gpu->fault_info.flags = flags; + gpu->fault_info.type = type; + gpu->fault_info.block = block; + + kthread_queue_work(gpu->worker, &gpu->fault_work); + } + return 0; } diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c index cf897297656f..4e88d4407667 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c @@ -684,6 +684,21 @@ void adreno_show(struct msm_gpu *gpu, struct msm_gpu_state *state, adreno_gpu->info->revn, adreno_gpu->rev.core, adreno_gpu->rev.major, adreno_gpu->rev.minor, adreno_gpu->rev.patchid); + /* + * If this state was collected due to an iova fault, show fault related info + * + * TTBR0 would not be zero in that case, so this is a good way to distinguish + */ + if (state->fault_info.ttbr0) { + const struct msm_gpu_fault_info *info = &state->fault_info; + + drm_puts(p, "fault-info:\n"); + drm_printf(p, " - ttbr0=%.16llx\n", info->ttbr0); + drm_printf(p, " - iova=%.16lx\n", info->iova); + drm_printf(p, " - dir=%s\n", info->flags & IOMMU_FAULT_WRITE ? 
"WRITE" : "READ"); + drm_printf(p, " - type=%s\n", info->type); + drm_printf(p, " - source=%s\n", info->block); + } drm_printf(p, "rbbm-status: 0x%08x\n", state->rbbm_status); diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h index 03e2cc2a2ce1..405f8411e395 100644 --- a/drivers/gpu/drm/msm/msm_gem.h +++ b/drivers/gpu/drm/msm/msm_gem.h @@ -328,6 +328,7 @@ struct msm_gem_submit { struct dma_fence *fence; struct msm_gpu_submitqueue *queue; struct pid *pid; /* submitting process */ + bool fault_dumped; /* Limit devcoredump dumping to one per submit */ bool valid; /* true if no cmdstream patching needed */ bool in_rb; /* "sudo" mode, copy cmds into RB */ struct msm_ringbuffer *ring; diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index 5480852bdeda..44f84bfd0c0e 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -50,6 +50,7 @@ static struct msm_gem_submit *submit_create(struct drm_device *dev, submit->cmd = (void *)&submit->bos[nr_bos]; submit->queue = queue; submit->ring = gpu->rb[queue->prio]; + submit->fault_dumped = false; /* initially, until copy_from_user() and bo lookup succeeds: */ submit->nr_bos = 0; diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c index 4d280bf446e6..4da2053c1ffb 100644 --- a/drivers/gpu/drm/msm/msm_gpu.c +++ b/drivers/gpu/drm/msm/msm_gpu.c @@ -401,6 +401,7 @@ static void msm_gpu_crashstate_capture(struct msm_gpu *gpu, /* Fill in the additional crash state information */ state->comm = kstrdup(comm, GFP_KERNEL); state->cmd = kstrdup(cmd, GFP_KERNEL); + state->fault_info = gpu->fault_info; if (submit) { int i, nr = 0; @@ -573,6 +574,52 @@ static void recover_worker(struct kthread_work *work) msm_gpu_retire(gpu); } +static void fault_worker(struct kthread_work *work) +{ + struct msm_gpu *gpu = container_of(work, struct msm_gpu, fault_work); + struct drm_device *dev = gpu->dev; + struct msm_gem_submit *submit; + 
struct msm_ringbuffer *cur_ring = gpu->funcs->active_ring(gpu); + char *comm = NULL, *cmd = NULL; + + mutex_lock(&dev->struct_mutex); + + submit = find_submit(cur_ring, cur_ring->memptrs->fence + 1); + if (submit && submit->fault_dumped) + goto resume_smmu; + + if (submit) { + struct task_struct *task; + + task = get_pid_task(submit->pid, PIDTYPE_PID); + if (task) { + comm = kstrdup(task->comm, GFP_KERNEL); + cmd = kstrdup_quotable_cmdline(task, GFP_KERNEL); + put_task_struct(task); + } + + /* + * When we get GPU iova faults, we can get 1000s of them, + * but we really only want to log the first one. + */ + submit->fault_dumped = true; + } + + /* Record the crash state */ + pm_runtime_get_sync(&gpu->pdev->dev); + msm_gpu_crashstate_capture(gpu, submit, comm, cmd, true); + pm_runtime_put_sync(&gpu->pdev->dev); + + kfree(cmd); + kfree(comm); + +resume_smmu: + memset(&gpu->fault_info, 0, sizeof(gpu->fault_info)); + gpu->aspace->mmu->funcs->resume_translation(gpu->aspace->mmu); + + mutex_unlock(&dev->struct_mutex); +} + static void hangcheck_timer_reset(struct msm_gpu *gpu) { mod_timer(&gpu->hangcheck_timer, @@ -949,6 +996,7 @@ int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev, INIT_LIST_HEAD(&gpu->active_list); kthread_init_work(&gpu->retire_work, retire_worker); kthread_init_work(&gpu->recover_work, recover_worker); + kthread_init_work(&gpu->fault_work, fault_worker); timer_setup(&gpu->hangcheck_timer, hangcheck_handler, 0); diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h index c15e5fd675d2..8dae601085ee 100644 --- a/drivers/gpu/drm/msm/msm_gpu.h +++ b/drivers/gpu/drm/msm/msm_gpu.h @@ -71,6 +71,15 @@ struct msm_gpu_funcs { uint32_t (*get_rptr)(struct msm_gpu *gpu, struct msm_ringbuffer *ring); }; +/* Additional state for iommu faults: */ +struct msm_gpu_fault_info { + u64 ttbr0; + unsigned long iova; + int flags; + const char *type; + const char *block; +}; + struct msm_gpu { const char *name; struct drm_device *dev; @@ 
-135,6 +144,12 @@ struct msm_gpu { #define DRM_MSM_HANGCHECK_JIFFIES msecs_to_jiffies(DRM_MSM_HANGCHECK_PERIOD) struct timer_list hangcheck_timer; + /* Fault info for most recent iova fault: */ + struct msm_gpu_fault_info fault_info; + + /* work for handling GPU ioval faults: */ + struct kthread_work fault_work; + /* work for handling GPU recovery: */ struct kthread_work recover_work; @@ -243,6 +258,8 @@ struct msm_gpu_state { char *comm; char *cmd; + struct msm_gpu_fault_info fault_info; + int nr_bos; struct msm_gpu_state_bo *bos; }; diff --git a/drivers/gpu/drm/msm/msm_gpummu.c b/drivers/gpu/drm/msm/msm_gpummu.c index 379496186c7f..f7d1945e0c9f 100644 --- a/drivers/gpu/drm/msm/msm_gpummu.c +++ b/drivers/gpu/drm/msm/msm_gpummu.c @@ -68,6 +68,10 @@ static int msm_gpummu_unmap(struct msm_mmu *mmu, uint64_t iova, size_t len) return 0; } +static void msm_gpummu_resume_translation(struct msm_mmu *mmu) +{ +} + static void msm_gpummu_destroy(struct msm_mmu *mmu) { struct msm_gpummu *gpummu = to_msm_gpummu(mmu); @@ -83,6 +87,7 @@ static const struct msm_mmu_funcs funcs = { .map = msm_gpummu_map, .unmap = msm_gpummu_unmap, .destroy = msm_gpummu_destroy, + .resume_translation = msm_gpummu_resume_translation, }; struct msm_mmu *msm_gpummu_new(struct device *dev, struct msm_gpu *gpu) diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c index 6975b95c3c29..eed2a762e9dd 100644 --- a/drivers/gpu/drm/msm/msm_iommu.c +++ b/drivers/gpu/drm/msm/msm_iommu.c @@ -184,6 +184,9 @@ struct msm_mmu *msm_iommu_pagetable_create(struct msm_mmu *parent) * the arm-smmu driver as a trigger to set up TTBR0 */ if (atomic_inc_return(&iommu->pagetables) == 1) { + /* Enable stall on iommu fault: */ + adreno_smmu->set_stall(adreno_smmu->cookie, true); + ret = adreno_smmu->set_ttbr0_cfg(adreno_smmu->cookie, &ttbr0_cfg); if (ret) { free_io_pgtable_ops(pagetable->pgtbl_ops); @@ -226,6 +229,13 @@ static int msm_fault_handler(struct iommu_domain *domain, struct device *dev, return 
0; } +static void msm_iommu_resume_translation(struct msm_mmu *mmu) +{ + struct adreno_smmu_priv *adreno_smmu = dev_get_drvdata(mmu->dev); + + adreno_smmu->resume_translation(adreno_smmu->cookie, true); +} + static void msm_iommu_detach(struct msm_mmu *mmu) { struct msm_iommu *iommu = to_msm_iommu(mmu); @@ -273,6 +283,7 @@ static const struct msm_mmu_funcs funcs = { .map = msm_iommu_map, .unmap = msm_iommu_unmap, .destroy = msm_iommu_destroy, + .resume_translation = msm_iommu_resume_translation, }; struct msm_mmu *msm_iommu_new(struct device *dev, struct iommu_domain *domain) diff --git a/drivers/gpu/drm/msm/msm_mmu.h b/drivers/gpu/drm/msm/msm_mmu.h index a88f44c3268d..de158e1bf765 100644 --- a/drivers/gpu/drm/msm/msm_mmu.h +++ b/drivers/gpu/drm/msm/msm_mmu.h @@ -15,6 +15,7 @@ struct msm_mmu_funcs { size_t len, int prot); int (*unmap)(struct msm_mmu *mmu, uint64_t iova, size_t len); void (*destroy)(struct msm_mmu *mmu); + void (*resume_translation)(struct msm_mmu *mmu); }; enum msm_mmu_type {