From patchwork Fri Feb 12 21:19:24 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matt Roper
X-Patchwork-Id: 12086103
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00,
	HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
	MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT
	autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id E3541C433DB
	for ; Fri, 12 Feb 2021 21:19:42 +0000 (UTC)
Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mail.kernel.org (Postfix) with ESMTPS id 7923364E13
	for ; Fri, 12 Feb 2021 21:19:42 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7923364E13
Authentication-Results: mail.kernel.org;
	dmarc=fail (p=none dis=none) header.from=intel.com
Authentication-Results: mail.kernel.org;
	spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org
Received: from gabe.freedesktop.org (localhost [127.0.0.1])
	by gabe.freedesktop.org (Postfix) with ESMTP id 0EC546E7FE;
	Fri, 12 Feb 2021 21:19:42 +0000 (UTC)
Received: from mga09.intel.com (mga09.intel.com [134.134.136.24])
	by gabe.freedesktop.org (Postfix) with ESMTPS id 64DA86E7FE
	for ; Fri, 12 Feb 2021 21:19:41 +0000 (UTC)
IronPort-SDR: AAnhtdqdSl5eNsPp4DoJJHyTFqfYeyXA2CvAS9k4WkI/ZQNooI61rvyFxe1dHo6I6IYSFzKads
	pht+EBIgOMug==
X-IronPort-AV: E=McAfee;i="6000,8403,9893"; a="182609133"
X-IronPort-AV: E=Sophos;i="5.81,174,1610438400";
	d="scan'208";a="182609133"
Received: from fmsmga008.fm.intel.com ([10.253.24.58])
	by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
	12 Feb 2021 13:19:40 -0800
IronPort-SDR: I8BUv5KAkz1X3ZtPGjqCSoYQ0UXyAVa7jR+W217kMQURHONSd/LdYuz2nPatlMsbfSRi342VuT
	ZtGkUOB/JzUA==
X-IronPort-AV: E=Sophos;i="5.81,174,1610438400";
	d="scan'208";a="381867046"
Received: from mdroper-desk1.fm.intel.com ([10.1.27.168])
	by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
	12 Feb 2021 13:19:40 -0800
From: Matt Roper
To: intel-gfx@lists.freedesktop.org
Date: Fri, 12 Feb 2021 13:19:24 -0800
Message-Id: <20210212211925.3418280-1-matthew.d.roper@intel.com>
X-Mailer: git-send-email 2.25.4
MIME-Version: 1.0
Subject: [Intel-gfx] [PATCH 1/2] drm/i915: FPGA_DBG is display-specific
X-BeenThere: intel-gfx@lists.freedesktop.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Intel graphics driver community testing & development
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Lucas De Marchi
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx"

Although the bspec's description doesn't make it very clear, the
hardware architects have confirmed that the FPGA_DBG register that we
use to check for unclaimed MMIO accesses is display-specific and will
only properly flag unclaimed MMIO transactions for registers in the
display range. If a platform doesn't have display, FPGA_DBG itself will
not be available and should not be checked. Let's move the feature flag
into intel_device_info.display to more accurately reflect this.

Given that we now know FPGA_DBG is display-specific, it could be argued
that we should only check it in our intel_de_*() functions. However,
let's not make that change right now; keeping the checks in all of the
existing locations still helps us catch cases where regular
intel_uncore_*() functions use bad MMIO offset math / base addresses and
accidentally wind up landing within an unused area within the display
MMIO range. It will also help catch cases where userspace-initiated
MMIO (e.g., IGT's intel_reg tool) attempts to read bad offsets within
the display range.
Cc: Lucas De Marchi
Signed-off-by: Matt Roper
Reviewed-by: Lucas De Marchi
---
 drivers/gpu/drm/i915/i915_pci.c          | 4 ++--
 drivers/gpu/drm/i915/intel_device_info.h | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index eff7155db2fd..a9f24f2bda33 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -538,7 +538,7 @@ static const struct intel_device_info vlv_info = {
 	.cpu_transcoder_mask = BIT(TRANSCODER_A) | BIT(TRANSCODER_B) | \
 		BIT(TRANSCODER_C) | BIT(TRANSCODER_EDP), \
 	.display.has_ddi = 1, \
-	.has_fpga_dbg = 1, \
+	.display.has_fpga_dbg = 1, \
 	.display.has_psr = 1, \
 	.display.has_psr_hw_tracking = 1, \
 	.display.has_dp_mst = 1, \
@@ -689,7 +689,7 @@ static const struct intel_device_info skl_gt4_info = {
 		BIT(TRANSCODER_DSI_A) | BIT(TRANSCODER_DSI_C), \
 	.has_64bit_reloc = 1, \
 	.display.has_ddi = 1, \
-	.has_fpga_dbg = 1, \
+	.display.has_fpga_dbg = 1, \
 	.display.has_fbc = 1, \
 	.display.has_hdcp = 1, \
 	.display.has_psr = 1, \
diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
index e6ca1023ffcf..d44f64b57b7a 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -118,7 +118,6 @@ enum intel_ppgtt_type {
 	func(has_64bit_reloc); \
 	func(gpu_reset_clobbers_display); \
 	func(has_reset_engine); \
-	func(has_fpga_dbg); \
 	func(has_global_mocs); \
 	func(has_gt_uc); \
 	func(has_l3_dpf); \
@@ -145,6 +144,7 @@ enum intel_ppgtt_type {
 	func(has_dsb); \
 	func(has_dsc); \
 	func(has_fbc); \
+	func(has_fpga_dbg); \
 	func(has_gmch); \
 	func(has_hdcp); \
 	func(has_hotplug); \

From patchwork Fri Feb 12 21:19:25 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matt Roper
X-Patchwork-Id: 12086105
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00,
	HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
	MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT
	autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 65B7DC433E0
	for ; Fri, 12 Feb 2021 21:19:45 +0000 (UTC)
Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mail.kernel.org (Postfix) with ESMTPS id 0B19864E08
	for ; Fri, 12 Feb 2021 21:19:45 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0B19864E08
Authentication-Results: mail.kernel.org;
	dmarc=fail (p=none dis=none) header.from=intel.com
Authentication-Results: mail.kernel.org;
	spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org
Received: from gabe.freedesktop.org (localhost [127.0.0.1])
	by gabe.freedesktop.org (Postfix) with ESMTP id AB5D26E802;
	Fri, 12 Feb 2021 21:19:44 +0000 (UTC)
Received: from mga09.intel.com (mga09.intel.com [134.134.136.24])
	by gabe.freedesktop.org (Postfix) with ESMTPS id 32B4A6E802
	for ; Fri, 12 Feb 2021 21:19:42 +0000 (UTC)
IronPort-SDR: BXpvsexUvbd0o7pQ57ag9a2WjAyRKrV5xZq+6jPKjYlwqFffnet8UF9OTyxrpcXPI9tu1SFrr7
	x/lq9iseSafQ==
X-IronPort-AV: E=McAfee;i="6000,8403,9893"; a="182609135"
X-IronPort-AV: E=Sophos;i="5.81,174,1610438400";
	d="scan'208";a="182609135"
Received: from fmsmga008.fm.intel.com ([10.253.24.58])
	by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
	12 Feb 2021 13:19:42 -0800
IronPort-SDR: 1CzD4aE23+Z9qbI1JUNVstfaRFBZ5nw/xLvr+S/FUpho+dxv3dVHATueceA97vrCM6Zp652KWf
	q1+mTLFcgs9w==
X-IronPort-AV: E=Sophos;i="5.81,174,1610438400";
	d="scan'208";a="381867055"
Received: from mdroper-desk1.fm.intel.com ([10.1.27.168])
	by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
	12 Feb 2021 13:19:41 -0800
From: Matt Roper
To: intel-gfx@lists.freedesktop.org
Date: Fri, 12 Feb 2021 13:19:25 -0800
Message-Id: <20210212211925.3418280-2-matthew.d.roper@intel.com>
X-Mailer: git-send-email 2.25.4
In-Reply-To: <20210212211925.3418280-1-matthew.d.roper@intel.com>
References: <20210212211925.3418280-1-matthew.d.roper@intel.com>
MIME-Version: 1.0
Subject: [Intel-gfx] [PATCH 2/2] drm/i915: Try to detect sudden loss of MMIO access
X-BeenThere: intel-gfx@lists.freedesktop.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Intel graphics driver community testing & development
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Lucas De Marchi
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx"

In rare circumstances, bugs in PCI programming, a broken BIOS, or
failing hardware can cause the CPU to lose access to the MMIO BAR on
dgfx platforms. This is a pretty catastrophic failure since all
register reads come back with values of 0xFFFFFFFF.

Let's check for this special case while doing our usual checks for
unclaimed registers. The FPGA_DBG register we use for those checks on
modern platforms has some unused bits that always read back as 0 when
things are behaving properly, so we can use them as canaries to detect
when MMIO itself has suddenly broken and print a more informative error
message in the logs.

v2: Let the detection function still return 'true' if we've lost our
    MMIO access. We'll still get an extra false positive message about
    an unclaimed register access, but we'll still honor the 'mmio_debug'
    limit and not spam the log.
    (Lucas)

Cc: Lucas De Marchi
Signed-off-by: Matt Roper
Reviewed-by: Lucas De Marchi
---
 drivers/gpu/drm/i915/intel_uncore.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 5098f95d71b0..661b50191f2b 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -465,6 +465,22 @@ fpga_check_for_unclaimed_mmio(struct intel_uncore *uncore)
 	if (likely(!(dbg & FPGA_DBG_RM_NOCLAIM)))
 		return false;
 
+	/*
+	 * Bugs in PCI programming (or failing hardware) can occasionally cause
+	 * us to lose access to the MMIO BAR.  When this happens, register
+	 * reads will come back with 0xFFFFFFFF for every register and things
+	 * go bad very quickly.  Let's try to detect that special case and at
+	 * least try to print a more informative message about what has
+	 * happened.
+	 *
+	 * During normal operation the FPGA_DBG register has several unused
+	 * bits that will always read back as 0's so we can use them as canaries
+	 * to recognize when MMIO accesses are just busted.
+	 */
+	if (unlikely(dbg == ~0))
+		drm_err(&uncore->i915->drm,
+			"Lost access to MMIO BAR; all registers now read back as 0xFFFFFFFF!\n");
+
 	__raw_uncore_write32(uncore, FPGA_DBG, FPGA_DBG_RM_NOCLAIM);
 
 	return true;
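
For readers who want to poke at the canary logic outside the driver, here
is a minimal userspace sketch of the decision the hunk above makes on a
value read from FPGA_DBG. It is an illustration only, not the driver
code: the enum and classify_fpga_dbg() are hypothetical names, and the
bit position used for FPGA_DBG_RM_NOCLAIM (bit 31) is an assumption,
since the patch does not show the definition. Unlike the patched
function, which returns 'true' for a lost BAR as well as for an
unclaimed access, the sketch reports the two cases separately so they
are easy to tell apart.

```c
#include <stdint.h>

/* Assumption: the "no claim" flag lives in bit 31; not shown in the patch. */
#define FPGA_DBG_RM_NOCLAIM (UINT32_C(1) << 31)

/* Hypothetical result type, used only for this illustration. */
enum mmio_status {
	MMIO_OK,        /* no unclaimed access flagged */
	MMIO_UNCLAIMED, /* flag set, but the unused bits still read as 0 */
	MMIO_BAR_LOST,  /* every bit set: reads are coming back as all ones */
};

/*
 * Classify a raw FPGA_DBG readback.  The unused bits act as canaries:
 * they read back as 0 on working hardware, so an all-ones value means
 * the read itself failed rather than that a register went unclaimed.
 */
static enum mmio_status classify_fpga_dbg(uint32_t dbg)
{
	/* Fast path, same test as the driver: flag clear means all is well. */
	if (!(dbg & FPGA_DBG_RM_NOCLAIM))
		return MMIO_OK;

	/* Flag set and all canary bits set too: MMIO access is gone. */
	if (dbg == UINT32_MAX)
		return MMIO_BAR_LOST;

	return MMIO_UNCLAIMED;
}
```

Note the ordering: the all-ones test only runs after the flag test has
already failed, so the common case still costs a single comparison, the
same as in the patch.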