[2/7] drm/i915/skl: Refuse to load outdated dmc firmware

Message ID 1446220336-32392-1-git-send-email-mika.kuoppala@intel.com (mailing list archive)
State New, archived

Commit Message

Mika Kuoppala Oct. 30, 2015, 3:52 p.m. UTC
There is a known issue with GT interrupt delivery with DC6 and
firmware versions < 1.21. This is suspected to cause
spurious gpu hangs on driver init and with some workloads,
as upgrading the firmware to 1.21 makes these problems
disappear.

As of now, the version included in distribution
firmware packages is very likely to be 1.19. Play it safe and
refuse to load a firmware version that may affect gpu
side stability.

Versions < 1.23 also have a palette and dmc ram corruption
issue, so blacklist anything below that.

v2: Refuse to load fw instead of notifying the user
v3: Rebase on header version changes
v4: Refuse to load anything less than 1.23
v5: Give enough information for user for finding correct fw (Chris)
v6: better url and formatting (Chris)
v7: move error log for each fail path (Mika)
    bail out earlier in load path (Imre)
v8: Fix the version check (Imre)

Cc: Animesh Manna <animesh.manna@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
References: https://01.org/linuxgraphics/downloads/skldmcver121
References: https://01.org/linuxgraphics/downloads/skylake-dmc-1.23
Testcase: igt/gem_exec_nop
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/intel_csr.c | 34 ++++++++++++++++++++++++----------
 1 file changed, 24 insertions(+), 10 deletions(-)
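
The version gate described in the commit message can be sketched outside the driver as follows. This is a minimal illustration, not the driver's code: it assumes the version is packed with the major number in the upper 16 bits and the minor number in the lower 16 bits, and the names `csr_version()`, `csr_version_major()`, `csr_version_minor()` and `skl_fw_version_ok()` are hypothetical stand-ins for the `CSR_VERSION*` macros used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed layout (not confirmed by the patch itself):
 * major in bits 31..16, minor in bits 15..0.
 */
static inline uint32_t csr_version(uint32_t major, uint32_t minor)
{
	assert(major <= 0xffff && minor <= 0xffff);
	return (major << 16) | minor;
}

static inline uint32_t csr_version_major(uint32_t version)
{
	return version >> 16;
}

static inline uint32_t csr_version_minor(uint32_t version)
{
	return version & 0xffff;
}

/*
 * Hypothetical helper: true if a Skylake DMC firmware version is at
 * least 1.23, the minimum the commit message asks for.
 */
static inline bool skl_fw_version_ok(uint32_t version)
{
	return version >= csr_version(1, 23);
}
```

With this packing, a plain integer comparison orders versions by major first and minor second, so 1.19 and 1.21 are rejected while 1.23 and 2.0 are accepted.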

Comments

Daniel Stone Nov. 3, 2015, 9:49 p.m. UTC | #1
Hi Mika,

On 30 October 2015 at 15:52, Mika Kuoppala
<mika.kuoppala@linux.intel.com> wrote:
> There is known issue on GT interrupt delivery with DC6 and
> firmwares <1.21. There is a suspicion that this causes
> spurious gpu hangs on driver init and with some workloads,
> as upgrading the firmware to 1.21 makes these problems
> disappear.
>
> As of now the current version included in distribution
> firmware packages is very like to be 1.19. Play it safe and
> refuse to load a firmware version that may affect gpu
> side stability.
>
> With < 1.23 there is a palette and dmc ram corruption issue
> so blacklist anything below that.

Unfortunately 1.23 is only available from 01.org, and doesn't appear
to have been submitted to linux-firmware. Rodrigo, is this going to be
submitted soon, or?

Cheers,
Daniel
Rodrigo Vivi Nov. 3, 2015, 11:23 p.m. UTC | #2
On Tue, 2015-11-03 at 21:49 +0000, Daniel Stone wrote:
> Hi Mika,
>
> On 30 October 2015 at 15:52, Mika Kuoppala
> <mika.kuoppala@linux.intel.com> wrote:
> > There is known issue on GT interrupt delivery with DC6 and
> > firmwares <1.21. There is a suspicion that this causes
> > spurious gpu hangs on driver init and with some workloads,
> > as upgrading the firmware to 1.21 makes these problems
> > disappear.
> >
> > As of now the current version included in distribution
> > firmware packages is very like to be 1.19. Play it safe and
> > refuse to load a firmware version that may affect gpu
> > side stability.
> >
> > With < 1.23 there is a palette and dmc ram corruption issue
> > so blacklist anything below that.
>
> Unfortunately 1.23 is only available from 01.org, and doesn't appear
> to have been submitted to linux-firmware. Rodrigo, is this going to be
> submitted soon, or?

I submitted it along with the release at 01.org.
It just got pulled and merged there.

> Cheers,
> Daniel


Thanks,
Rodrigo.
Daniel Stone Nov. 4, 2015, 9:51 a.m. UTC | #3
Hi Rodrigo,

On 3 November 2015 at 23:23, Vivi, Rodrigo <rodrigo.vivi@intel.com> wrote:
> On Tue, 2015-11-03 at 21:49 +0000, Daniel Stone wrote:
>> On 30 October 2015 at 15:52, Mika Kuoppala
>> <mika.kuoppala@linux.intel.com> wrote:
>> > With < 1.23 there is a palette and dmc ram corruption issue
>> > so blacklist anything below that.
>>
>> Unfortunately 1.23 is only available from 01.org, and doesn't appear
>> to have been submitted to linux-firmware. Rodrigo, is this going to
>> be
>> submitted soon, or?
>
> I submit along with the release at 01.org.
> It just got pulled and merged there.

Yes, you're right - the reason I couldn't find the submission anywhere
is because linux-firmware@ is just an alias rather than an actual
archived list! So yes, it was actually submitted, and got merged last
night as well. Sorry about that.

Cheers,
Daniel
Patch

diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
index e620e85..25b6ba7 100644
--- a/drivers/gpu/drm/i915/intel_csr.c
+++ b/drivers/gpu/drm/i915/intel_csr.c
@@ -47,6 +47,8 @@ 
 MODULE_FIRMWARE(I915_CSR_SKL);
 MODULE_FIRMWARE(I915_CSR_BXT);
 
+#define SKL_CSR_VERSION_REQUIRED	CSR_VERSION(1, 23)
+
 /*
 * SKL CSR registers for DC5 and DC6
 */
@@ -303,10 +305,8 @@  static void finish_csr_load(const struct firmware *fw, void *context)
 	uint32_t *dmc_payload;
 	bool fw_loaded = false;
 
-	if (!fw) {
-		i915_firmware_load_error_print(csr->fw_path, 0);
+	if (!fw)
 		goto out;
-	}
 
 	if ((stepping == -ENODATA) || (substepping == -ENODATA)) {
 		DRM_ERROR("Unknown stepping info, firmware loading failed\n");
@@ -324,6 +324,17 @@  static void finish_csr_load(const struct firmware *fw, void *context)
 
 	csr->version = css_header->version;
 
+	if (IS_SKYLAKE(dev) && csr->version < SKL_CSR_VERSION_REQUIRED) {
+		DRM_INFO("Refusing to load old Skylake DMC firmware v%u.%u,"
+			 " please upgrade to v%u.%u or later"
+			 " [https://01.org/linuxgraphics/intel-linux-graphics-firmwares].\n",
+			 CSR_VERSION_MAJOR(csr->version),
+			 CSR_VERSION_MINOR(csr->version),
+			 CSR_VERSION_MAJOR(SKL_CSR_VERSION_REQUIRED),
+			 CSR_VERSION_MINOR(SKL_CSR_VERSION_REQUIRED));
+		goto out;
+	}
+
 	readcount += sizeof(struct intel_css_header);
 
 	/* Extract Package Header information*/
@@ -405,17 +416,20 @@  static void finish_csr_load(const struct firmware *fw, void *context)
 	intel_csr_load_program(dev);
 	fw_loaded = true;
 
-	DRM_INFO("Finished loading %s (v%u.%u)\n",
-		 dev_priv->csr.fw_path,
-		 CSR_VERSION_MAJOR(csr->version),
-		 CSR_VERSION_MINOR(csr->version));
-
 out:
-	if (fw_loaded)
+	if (fw_loaded) {
 		intel_runtime_pm_put(dev_priv);
-	else
+
+		DRM_INFO("Finished loading %s (v%u.%u)\n",
+			 dev_priv->csr.fw_path,
+			 CSR_VERSION_MAJOR(csr->version),
+			 CSR_VERSION_MINOR(csr->version));
+	} else {
 		intel_csr_load_status_set(dev_priv, FW_FAILED);
 
+		i915_firmware_load_error_print(csr->fw_path, 0);
+	}
+
 	release_firmware(fw);
 }
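
The control flow the patch moves to, where every failure path jumps to a single `out:` label and the success or failure message is printed there, can be sketched in isolation. This is only an illustration under stated assumptions: `struct demo_fw`, `demo_finish_load()` and the message strings are hypothetical, the packed version layout (major in the upper 16 bits) is assumed, and the zero-size check merely stands in for the real header parsing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a firmware blob; not the kernel's struct firmware. */
struct demo_fw {
	size_t size;
	unsigned int version;	/* assumed packing: (major << 16) | minor */
};

/* Returns true if the blob was accepted, false on any failure path. */
static bool demo_finish_load(const struct demo_fw *fw, const char *path,
			     unsigned int required)
{
	bool loaded = false;

	assert(path != NULL);

	if (!fw)
		goto out;	/* nothing to parse; the error is logged below */

	if (fw->version < required) {
		/* Specific hint first, then fall into the common failure path. */
		printf("refusing old firmware v%u.%u, need v%u.%u or later\n",
		       fw->version >> 16, fw->version & 0xffff,
		       required >> 16, required & 0xffff);
		goto out;
	}

	if (fw->size == 0)
		goto out;	/* stand-in for the header and payload checks */

	loaded = true;

out:
	/* Single exit: exactly one of these messages is printed per call. */
	if (loaded)
		printf("finished loading %s (v%u.%u)\n", path,
		       fw->version >> 16, fw->version & 0xffff);
	else
		fprintf(stderr, "failed to load firmware %s\n", path);

	return loaded;
}
```

Keeping the generic error message at the exit label means a newly added early `goto out` cannot forget to report the failure, which appears to be the point of the v7 change.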