[v4] drm/i915/mocs: Program MOCS for all engines on init

Message ID 1457968262-8045-1-git-send-email-peter.antoine@intel.com (mailing list archive)
State New, archived

Commit Message

Peter Antoine March 14, 2016, 3:11 p.m. UTC
Allow for the MOCS to be programmed for all engines.
Currently we program the MOCS when the first render batch
goes through. This works on most platforms but fails on
platforms that do not run a render batch early,
e.g. headless servers. The patch now programs all initialised engines
on init and the RCS is programmed again within the initial batch. This
is done for predictable consistency with regard to the hardware
context.

Hardware context loading sets the values of the MOCS for RCS
and L3CC. Programming them from within the batch makes sure that
the render context is valid, no matter what the previous state of
the saved-context was.

v2: posted correct version to the mailing list.
v3: moved programming to within engine->init_hw() (Chris Wilson)
v4: code formatting and white-space changes. (Chris Wilson)

Signed-off-by: Peter Antoine <peter.antoine@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c   |   3 +
 drivers/gpu/drm/i915/intel_lrc.c  |   2 +-
 drivers/gpu/drm/i915/intel_mocs.c | 132 ++++++++++++++++++++++++++++++--------
 drivers/gpu/drm/i915/intel_mocs.h |   2 +
 4 files changed, 110 insertions(+), 29 deletions(-)

Comments

Chris Wilson March 15, 2016, 10:15 a.m. UTC | #1
On Mon, Mar 14, 2016 at 03:11:02PM +0000, Peter Antoine wrote:
> Allow for the MOCS to be programmed for all engines.
> Currently we program the MOCS when the first render batch
> goes through. This works on most platforms but fails on
> platforms that do not run a render batch early,
> i.e. headless servers. The patch now programs all initialised engines
> on init and the RCS is programmed again within the initial batch. This
> is done for predictable consistency with regards to the hardware
> context.
> 
> Hardware context loading sets the values of the MOCS for RCS
> and L3CC. Programming them from within the batch makes sure that
> the render context is valid, no matter what the previous state of
> the saved-context was.
> 
> v2: posted correct version to the mailing list.
> v3: moved programming to within engine->init_hw() (Chris Wilson)
> v4: code formatting and white-space changes. (Chris Wilson)
> 
> Signed-off-by: Peter Antoine <peter.antoine@intel.com>

So testcase?

Execute a bunch of MI_STORE_REGISTER_MEM on the various rings in a fresh
context each time and confirm the ABI for the first N locations.

Repeat across suspend/resume (i.e. make sure the context image maintains
the register state). Then verify that freshly constructed contexts also
have the correct settings after resume.
-Chris
Peter Antoine March 15, 2016, 2:40 p.m. UTC | #2
Chris,

Testcases are underway; the validation team is working on them.

Peter.

--
    Peter Antoine (Android Graphics Driver Software Engineer)
    ---------------------------------------------------------------------
    Intel Corporation (UK) Limited
    Registered No. 1134945 (England)
    Registered Office: Pipers Way, Swindon SN3 1RJ
    VAT No: 860 2173 47
Peter Antoine March 22, 2016, 9:02 a.m. UTC | #3
Chris.

Can these patches be reviewed without the tests, or are they blocked
waiting for the tests?

Are the patches acceptable?

Thanks,
Peter.

Chris Wilson March 22, 2016, 9:36 a.m. UTC | #4
On Tue, Mar 22, 2016 at 09:02:17AM +0000, Peter Antoine wrote:
> Chris.
> 
> Can these patches be reviewed without the tests or are they blocked
> waiting for the tests.

More or less waiting upon the tests. Or where is the bugzilla?

It would only take a couple of hours to write something like:

for_each_engine() {
	switch (mode) {
	case NOTHING: break;
	case RESET: reset_gpu(); break;
	case SUSPEND: suspend_autoresume(); break;
	case HIBERNATE: hibernate_autoresume(); break;
	}
	fd = drm_open_driver();
	if (use_context)
		ctx = gem_context_create();
	for_each_mocs()
		MI_STORE_REGISTER_MEM(engine, mocs, out[i]);
	execbuf(engine);

	gem_set_domain(out, DOMAIN_CPU, 0);
	/* Check ABI caching levels */
	for_each_mocs()
		igt_assert(out[i], foo);
	close(fd);
}

> Are the patches acceptable?

Yes, with a reference to a bug demonstrating the impact and a testcase
to demonstrate it works.
-Chris
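
The comparison step in the outline above could be sketched in plain C as
follows. This is only an illustration under stated assumptions: the names
expected_control() and first_mismatch() are hypothetical and are not IGT or
i915 functions, and the readback array stands in for the values that
MI_STORE_REGISTER_MEM would have written out. The fallback to entry 0 mirrors
how the patch fills unused slots; since the patch comment says those slots
carry no contract, a caller may prefer to pass num_regs equal to the table
size and check only the defined entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Expected control value for MOCS index i: the table entry when one is
 * defined, otherwise entry 0 (uncached), as the patch does for unused slots.
 * Hypothetical helper, not part of the driver or of IGT.
 */
uint32_t expected_control(const uint32_t *table, size_t size, size_t i)
{
	assert(size >= 1);
	return i < size ? table[i] : table[0];
}

/*
 * Compare num_regs read-back values against the expected ones.
 * Returns the index of the first mismatch, or -1 if everything matches.
 */
int first_mismatch(const uint32_t *table, size_t size,
		   const uint32_t *readback, size_t num_regs)
{
	size_t i;

	for (i = 0; i < num_regs; i++)
		if (readback[i] != expected_control(table, size, i))
			return (int)i;

	return -1;
}
```

A test would call first_mismatch() once per engine and per mode (no-op,
reset, suspend, hibernate) and fail on any result other than -1.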
Peter Antoine April 12, 2016, 11:52 a.m. UTC | #5
Chris,

If the test is OK, can you add your Reviewed-by to this patch?

Thanks,
Peter.

Chris Wilson April 13, 2016, 1:52 p.m. UTC | #6
On Tue, Apr 12, 2016 at 12:52:24PM +0100, Peter Antoine wrote:
> Chris,
> 
> If the test is ok, can you review-by this patch.

Yup, my box decided that was a good time to suffer fs corruption. Patched
up and resent for CI with my r-b.
-Chris

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index b854af2..73e4892 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -32,6 +32,7 @@ 
 #include "i915_vgpu.h"
 #include "i915_trace.h"
 #include "intel_drv.h"
+#include "intel_mocs.h"
 #include <linux/shmem_fs.h>
 #include <linux/slab.h>
 #include <linux/swap.h>
@@ -4882,6 +4883,8 @@  i915_gem_init_hw(struct drm_device *dev)
 			goto out;
 	}
 
+	intel_mocs_init_l3cc_table(dev);
+
 	/* We can't enable contexts until all firmware is loaded */
 	if (HAS_GUC_UCODE(dev)) {
 		ret = intel_guc_ucode_load(dev);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 27c9ee3..03ead7f 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1603,7 +1603,7 @@  static int gen8_init_common_ring(struct intel_engine_cs *ring)
 
 	memset(&ring->hangcheck, 0, sizeof(ring->hangcheck));
 
-	return 0;
+	return intel_mocs_init_engine(ring);
 }
 
 static int gen8_init_render_ring(struct intel_engine_cs *ring)
diff --git a/drivers/gpu/drm/i915/intel_mocs.c b/drivers/gpu/drm/i915/intel_mocs.c
index fed7bea..8e490af 100644
--- a/drivers/gpu/drm/i915/intel_mocs.c
+++ b/drivers/gpu/drm/i915/intel_mocs.c
@@ -128,9 +128,9 @@  static const struct drm_i915_mocs_entry broxton_mocs_table[] = {
 
 /**
  * get_mocs_settings()
- * @dev:        DRM device.
+ * @dev:	DRM device.
  * @table:      Output table that will be made to point at appropriate
- *              MOCS values for the device.
+ *	      MOCS values for the device.
  *
  * This function will return the values of the MOCS table that needs to
  * be programmed for the platform. It will return the values that need
@@ -179,6 +179,47 @@  static i915_reg_t mocs_register(enum intel_ring_id ring, int index)
 }
 
 /**
+ * intel_mocs_init_engine() - program the mocs control table
+ * @ring:	The engine for which to program the registers.
+ *
+ * This function writes the control value of each table entry to the
+ * MOCS control registers of the given engine using direct MMIO writes.
+ *
+ * Return: 0 on success, otherwise the error status.
+ */
+int intel_mocs_init_engine(struct intel_engine_cs *ring)
+{
+	struct drm_device *dev = ring->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_mocs_table table;
+	unsigned int index;
+
+	if (!get_mocs_settings(dev, &table))
+		return 0;
+
+	if (WARN_ON(table.size > GEN9_NUM_MOCS_ENTRIES))
+		return -ENODEV;
+
+	for (index = 0; index < table.size; index++)
+		I915_WRITE(mocs_register(ring->id, index),
+			   table.table[index].control_value);
+
+	/*
+	 * Ok, now set the unused entries to uncached. These entries
+	 * are officially undefined and no contract for the contents
+	 * and settings is given for these entries.
+	 *
+	 * Entry 0 in the table is uncached - so we are just writing
+	 * that value to all the unused entries.
+	 */
+	for (; index < GEN9_NUM_MOCS_ENTRIES; index++)
+		I915_WRITE(mocs_register(ring->id, index),
+			   table.table[0].control_value);
+
+	return 0;
+}
+
+/**
  * emit_mocs_control_table() - emit the mocs control table
  * @req:	Request to set up the MOCS table for.
  * @table:	The values to program into the control regs.
@@ -234,6 +275,14 @@  static int emit_mocs_control_table(struct drm_i915_gem_request *req,
 	return 0;
 }
 
+static inline u32 l3cc_combine(const struct drm_i915_mocs_table *table,
+			       u16 low,
+			       u16 high)
+{
+	return table->table[low].l3cc_value |
+	       table->table[high].l3cc_value << 16;
+}
+
 /**
  * emit_mocs_l3cc_table() - emit the mocs control table
  * @req:	Request to set up the MOCS table for.
@@ -249,11 +298,7 @@  static int emit_mocs_l3cc_table(struct drm_i915_gem_request *req,
 				const struct drm_i915_mocs_table *table)
 {
 	struct intel_ringbuffer *ringbuf = req->ringbuf;
-	unsigned int count;
 	unsigned int i;
-	u32 value;
-	u32 filler = (table->table[0].l3cc_value & 0xffff) |
-			((table->table[0].l3cc_value & 0xffff) << 16);
 	int ret;
 
 	if (WARN_ON(table->size > GEN9_NUM_MOCS_ENTRIES))
@@ -268,20 +313,18 @@  static int emit_mocs_l3cc_table(struct drm_i915_gem_request *req,
 	intel_logical_ring_emit(ringbuf,
 			MI_LOAD_REGISTER_IMM(GEN9_NUM_MOCS_ENTRIES / 2));
 
-	for (i = 0, count = 0; i < table->size / 2; i++, count += 2) {
-		value = (table->table[count].l3cc_value & 0xffff) |
-			((table->table[count + 1].l3cc_value & 0xffff) << 16);
-
+	for (i = 0; i < table->size/2; i++) {
 		intel_logical_ring_emit_reg(ringbuf, GEN9_LNCFCMOCS(i));
-		intel_logical_ring_emit(ringbuf, value);
+		intel_logical_ring_emit(ringbuf,
+					l3cc_combine(table, 2*i, 2*i+1));
 	}
 
 	if (table->size & 0x01) {
 		/* Odd table size - 1 left over */
-		value = (table->table[count].l3cc_value & 0xffff) |
-			((table->table[0].l3cc_value & 0xffff) << 16);
-	} else
-		value = filler;
+		intel_logical_ring_emit_reg(ringbuf, GEN9_LNCFCMOCS(i));
+		intel_logical_ring_emit(ringbuf, l3cc_combine(table, 2*i, 0));
+		i++;
+	}
 
 	/*
 	 * Now set the rest of the table to uncached - use entry 0 as
@@ -290,9 +333,7 @@  static int emit_mocs_l3cc_table(struct drm_i915_gem_request *req,
 	 */
 	for (; i < GEN9_NUM_MOCS_ENTRIES / 2; i++) {
 		intel_logical_ring_emit_reg(ringbuf, GEN9_LNCFCMOCS(i));
-		intel_logical_ring_emit(ringbuf, value);
-
-		value = filler;
+		intel_logical_ring_emit(ringbuf, l3cc_combine(table, 0, 0));
 	}
 
 	intel_logical_ring_emit(ringbuf, MI_NOOP);
@@ -302,6 +343,47 @@  static int emit_mocs_l3cc_table(struct drm_i915_gem_request *req,
 }
 
 /**
+ * intel_mocs_init_l3cc_table() - program the mocs control table
+ * @dev:      The device to be programmed.
+ *
+ * This function simply programs the mocs registers for the given table
+ * starting at the given address. This register set is programmed in pairs.
+ *
+ * These registers may get programmed more than once; it is simpler to
+ * re-program 32 registers than maintain the state of when they were programmed.
+ * We are always reprogramming with the same values and this only happens on
+ * context start.
+ *
+ * Return: Nothing.
+ */
+void intel_mocs_init_l3cc_table(struct drm_device *dev)
+{
+	unsigned int i;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_mocs_table table;
+
+	if (!get_mocs_settings(dev, &table))
+		return;
+
+	for (i = 0; i < table.size/2; i++)
+		I915_WRITE(GEN9_LNCFCMOCS(i), l3cc_combine(&table, 2*i, 2*i+1));
+
+	/* Odd table size - 1 left over */
+	if (table.size & 0x01) {
+		I915_WRITE(GEN9_LNCFCMOCS(i), l3cc_combine(&table, 2*i, 0));
+		i++;
+	}
+
+	/*
+	 * Now set the rest of the table to uncached - use entry 0 as
+	 * this will be uncached. Leave the last pair as initialised as
+	 * they are reserved by the hardware.
+	 */
+	for (; i < (GEN9_NUM_MOCS_ENTRIES / 2); i++)
+		I915_WRITE(GEN9_LNCFCMOCS(i), l3cc_combine(&table, 0, 0));
+}
+
+/**
  * intel_rcs_context_init_mocs() - program the MOCS register.
  * @req:	Request to set up the MOCS tables for.
  *
@@ -323,16 +405,10 @@  int intel_rcs_context_init_mocs(struct drm_i915_gem_request *req)
 	int ret;
 
 	if (get_mocs_settings(req->ring->dev, &t)) {
-		struct drm_i915_private *dev_priv = req->i915;
-		struct intel_engine_cs *ring;
-		enum intel_ring_id ring_id;
-
-		/* Program the control registers */
-		for_each_ring(ring, dev_priv, ring_id) {
-			ret = emit_mocs_control_table(req, &t, ring_id);
-			if (ret)
-				return ret;
-		}
+		/* Program the RCS control registers */
+		ret = emit_mocs_control_table(req, &t, RCS);
+		if (ret)
+			return ret;
 
 		/* Now program the l3cc registers */
 		ret = emit_mocs_l3cc_table(req, &t);
diff --git a/drivers/gpu/drm/i915/intel_mocs.h b/drivers/gpu/drm/i915/intel_mocs.h
index 76e45b1..4640299 100644
--- a/drivers/gpu/drm/i915/intel_mocs.h
+++ b/drivers/gpu/drm/i915/intel_mocs.h
@@ -53,5 +53,7 @@ 
 #include "i915_drv.h"
 
 int intel_rcs_context_init_mocs(struct drm_i915_gem_request *req);
+void intel_mocs_init_l3cc_table(struct drm_device *dev);
+int intel_mocs_init_engine(struct intel_engine_cs *ring);
 
 #endif
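
The pairing logic shared by emit_mocs_l3cc_table() and
intel_mocs_init_l3cc_table() can be tried outside the kernel with a small
stand-alone sketch. It is a simplified model, not the driver code: the names
pack_pair() and fill_l3cc_regs() are hypothetical, the table is reduced to a
plain array of 16-bit values, registers are an ordinary array instead of MMIO,
and EXAMPLE_NUM_ENTRIES is an illustrative stand-in for GEN9_NUM_MOCS_ENTRIES
rather than its real value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative size only; the driver uses GEN9_NUM_MOCS_ENTRIES. */
#define EXAMPLE_NUM_ENTRIES 8

/* Pack two 16-bit L3CC values into one 32-bit word, low entry in bits 15:0. */
uint32_t pack_pair(const uint16_t *l3cc, size_t low, size_t high)
{
	return (uint32_t)l3cc[low] | ((uint32_t)l3cc[high] << 16);
}

/*
 * Fill EXAMPLE_NUM_ENTRIES / 2 register words from a table of size entries.
 * Full pairs come first, an odd leftover entry is paired with entry 0, and
 * every remaining word is filled with entry 0 in both halves.
 */
void fill_l3cc_regs(const uint16_t *l3cc, size_t size, uint32_t *regs)
{
	size_t i;

	assert(size >= 1 && size <= EXAMPLE_NUM_ENTRIES);

	for (i = 0; i < size / 2; i++)
		regs[i] = pack_pair(l3cc, 2 * i, 2 * i + 1);

	/* Odd table size: one entry left over */
	if (size & 1) {
		regs[i] = pack_pair(l3cc, 2 * i, 0);
		i++;
	}

	for (; i < EXAMPLE_NUM_ENTRIES / 2; i++)
		regs[i] = pack_pair(l3cc, 0, 0);
}
```

The casts to uint32_t before shifting are a choice made for this sketch so
that a value with bit 15 set is not shifted as a signed int.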